linux-next/fs
Kuniyuki Iwashima ef7134c7fc smb: client: Fix use-after-free of network namespace.
Recently, we got a customer report that CIFS triggers an oops while
reconnecting to a server.  [0]

The workload runs on Kubernetes, and some pods mount CIFS servers
in non-root network namespaces.  The problem rarely happened, but
it was always while the pod was dying.

The root cause is incorrect reference counting of the network namespace.

CIFS uses kernel sockets, which do not hold a refcnt on the netns that
the socket belongs to.  That means CIFS must ensure the socket is
always freed before its netns; otherwise, a use-after-free happens.

The repro steps are roughly:

  1. mount CIFS in a non-root netns
  2. drop packets from the netns
  3. destroy the netns
  4. unmount CIFS

We can reproduce the issue quickly with the script [1] below and see
the splat [2] if CONFIG_NET_NS_REFCNT_TRACKER is enabled.
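
Before running the script, it can help to confirm that the tracker option
is actually built in. The following is a minimal sketch, not part of the
original report: kconfig_enabled is a hypothetical helper, and the path
/boot/config-$(uname -r) is an assumption about where the distro installs
the kernel config (it may live elsewhere or be absent).

```shell
#!/bin/sh
# Hypothetical helper: succeed if OPTION is set to "y" in a kernel
# config file.  Usage: kconfig_enabled FILE OPTION
kconfig_enabled() {
    grep -q "^$2=y\$" "$1"
}

# Assumed location of the running kernel's config; many distros install
# it here, but this path is not guaranteed to exist.
cfg="/boot/config-$(uname -r)"

if [ -r "$cfg" ] && kconfig_enabled "$cfg" CONFIG_NET_NS_REFCNT_TRACKER; then
    echo "netns refcount tracker enabled"
else
    echo "netns refcount tracker not enabled, or $cfg not readable"
fi
```

If the option is not enabled, the ref_tracker splat in [2] will not be
printed even when the netns reference is leaked.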

When the socket is TCP, it is hard to guarantee the netns lifetime
without holding a refcnt, because of async timers.

Let's hold netns refcnt for each socket as done for SMC in commit
9744d2bf19 ("smc: Fix use-after-free in tcp_write_timer_handler().").

Note that we need to move put_net() from cifs_put_tcp_session() to
clean_demultiplex_info(); otherwise, __sock_create() could still touch a
freed netns while cifsd tries to reconnect from cifs_demultiplex_thread().

Also, maybe_get_net() cannot be placed just before __sock_create() because
the code is not under RCU and there is a small chance that the same
address has been reallocated to another netns.

[0]:
CIFS: VFS: \\XXXXXXXXXXX has not responded in 15 seconds. Reconnecting...
CIFS: Serverclose failed 4 times, giving up
Unable to handle kernel paging request at virtual address 14de99e461f84a07
Mem abort info:
  ESR = 0x0000000096000004
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
  FSC = 0x04: level 0 translation fault
Data abort info:
  ISV = 0, ISS = 0x00000004
  CM = 0, WnR = 0
[14de99e461f84a07] address between user and kernel address ranges
Internal error: Oops: 0000000096000004 [#1] SMP
Modules linked in: cls_bpf sch_ingress nls_utf8 cifs cifs_arc4 cifs_md4 dns_resolver tcp_diag inet_diag veth xt_state xt_connmark nf_conntrack_netlink xt_nat xt_statistic xt_MASQUERADE xt_mark xt_addrtype ipt_REJECT nf_reject_ipv4 nft_chain_nat nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 xt_comment nft_compat nf_tables nfnetlink overlay nls_ascii nls_cp437 sunrpc vfat fat aes_ce_blk aes_ce_cipher ghash_ce sm4_ce_cipher sm4 sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 sha1_ce ena button sch_fq_codel loop fuse configfs dmi_sysfs sha2_ce sha256_arm64 dm_mirror dm_region_hash dm_log dm_mod dax efivarfs
CPU: 5 PID: 2690970 Comm: cifsd Not tainted 6.1.103-109.184.amzn2023.aarch64 #1
Hardware name: Amazon EC2 r7g.4xlarge/, BIOS 1.0 11/1/2018
pstate: 00400005 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
pc : fib_rules_lookup+0x44/0x238
lr : __fib_lookup+0x64/0xbc
sp : ffff8000265db790
x29: ffff8000265db790 x28: 0000000000000000 x27: 000000000000bd01
x26: 0000000000000000 x25: ffff000b4baf8000 x24: ffff00047b5e4580
x23: ffff8000265db7e0 x22: 0000000000000000 x21: ffff00047b5e4500
x20: ffff0010e3f694f8 x19: 14de99e461f849f7 x18: 0000000000000000
x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000
x14: 0000000000000000 x13: 0000000000000000 x12: 3f92800abd010002
x11: 0000000000000001 x10: ffff0010e3f69420 x9 : ffff800008a6f294
x8 : 0000000000000000 x7 : 0000000000000006 x6 : 0000000000000000
x5 : 0000000000000001 x4 : ffff001924354280 x3 : ffff8000265db7e0
x2 : 0000000000000000 x1 : ffff0010e3f694f8 x0 : ffff00047b5e4500
Call trace:
 fib_rules_lookup+0x44/0x238
 __fib_lookup+0x64/0xbc
 ip_route_output_key_hash_rcu+0x2c4/0x398
 ip_route_output_key_hash+0x60/0x8c
 tcp_v4_connect+0x290/0x488
 __inet_stream_connect+0x108/0x3d0
 inet_stream_connect+0x50/0x78
 kernel_connect+0x6c/0xac
 generic_ip_connect+0x10c/0x6c8 [cifs]
 __reconnect_target_unlocked+0xa0/0x214 [cifs]
 reconnect_dfs_server+0x144/0x460 [cifs]
 cifs_reconnect+0x88/0x148 [cifs]
 cifs_readv_from_socket+0x230/0x430 [cifs]
 cifs_read_from_socket+0x74/0xa8 [cifs]
 cifs_demultiplex_thread+0xf8/0x704 [cifs]
 kthread+0xd0/0xd4
Code: aa0003f8 f8480f13 eb18027f 540006c0 (b9401264)

[1]:
CIFS_CRED="/root/cred.cifs"
CIFS_USER="Administrator"
CIFS_PASS="Password"
CIFS_IP="X.X.X.X"
CIFS_PATH="//${CIFS_IP}/Users/Administrator/Desktop/CIFS_TEST"
CIFS_MNT="/mnt/smb"
DEV="enp0s3"

cat <<EOF > ${CIFS_CRED}
username=${CIFS_USER}
password=${CIFS_PASS}
domain=EXAMPLE.COM
EOF

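# Steps 1-3: mount CIFS in a new netns, cut off its traffic by deleting
# the MASQUERADE rule, then let the netns go away when bash exits.
unshare -n bash -c "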
mkdir -p ${CIFS_MNT}
ip netns attach root 1
ip link add eth0 type veth peer veth0 netns root
ip link set eth0 up
ip -n root link set veth0 up
ip addr add 192.168.0.2/24 dev eth0
ip -n root addr add 192.168.0.1/24 dev veth0
ip route add default via 192.168.0.1 dev eth0
ip netns exec root sysctl net.ipv4.ip_forward=1
ip netns exec root iptables -t nat -A POSTROUTING -s 192.168.0.2 -o ${DEV} -j MASQUERADE
mount -t cifs ${CIFS_PATH} ${CIFS_MNT} -o vers=3.0,sec=ntlmssp,credentials=${CIFS_CRED},rsize=65536,wsize=65536,cache=none,echo_interval=1
touch ${CIFS_MNT}/a.txt
ip netns exec root iptables -t nat -D POSTROUTING -s 192.168.0.2 -o ${DEV} -j MASQUERADE
"

# Step 4: unmount CIFS after the netns is gone.
umount ${CIFS_MNT}

[2]:
ref_tracker: net notrefcnt@000000004bbc008d has 1/1 users at
     sk_alloc (./include/net/net_namespace.h:339 net/core/sock.c:2227)
     inet_create (net/ipv4/af_inet.c:326 net/ipv4/af_inet.c:252)
     __sock_create (net/socket.c:1576)
     generic_ip_connect (fs/smb/client/connect.c:3075)
     cifs_get_tcp_session.part.0 (fs/smb/client/connect.c:3160 fs/smb/client/connect.c:1798)
     cifs_mount_get_session (fs/smb/client/trace.h:959 fs/smb/client/connect.c:3366)
     dfs_mount_share (fs/smb/client/dfs.c:63 fs/smb/client/dfs.c:285)
     cifs_mount (fs/smb/client/connect.c:3622)
     cifs_smb3_do_mount (fs/smb/client/cifsfs.c:949)
     smb3_get_tree (fs/smb/client/fs_context.c:784 fs/smb/client/fs_context.c:802 fs/smb/client/fs_context.c:794)
     vfs_get_tree (fs/super.c:1800)
     path_mount (fs/namespace.c:3508 fs/namespace.c:3834)
     __x64_sys_mount (fs/namespace.c:3848 fs/namespace.c:4057 fs/namespace.c:4034 fs/namespace.c:4034)
     do_syscall_64 (arch/x86/entry/common.c:52 arch/x86/entry/common.c:83)
     entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130)

Fixes: 26abe14379 ("net: Modify sk_alloc to not reference count the netns of kernel sockets.")
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Acked-by: Tom Talpey <tom@talpey.com>
Signed-off-by: Steve French <stfrench@microsoft.com>
2024-11-03 19:28:31 -06:00
9p Revert patches causing inode collision problems 2024-10-25 15:25:02 -07:00
adfs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
affs affs-for-6.12-tag 2024-09-16 13:07:59 +02:00
afs vfs-6.12-rc6.fixes 2024-11-01 07:37:10 -10:00
autofs autofs: fix thinko in validate_dev_ioctl() 2024-10-28 13:16:56 +01:00
bcachefs bcachefs fixes for 6.12-rc6 2024-11-01 07:21:03 -10:00
befs befs: Convert befs_symlink_read_folio() to use folio_end_read() 2024-05-31 12:31:39 +02:00
bfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
btrfs for-6.12-rc5-tag 2024-11-01 07:31:47 -10:00
cachefiles cachefiles: fix dentry leak in cachefiles_open_file() 2024-09-27 18:29:19 +02:00
ceph A fix from Patrick for a variety of CephFS lockup scenarios caused by 2024-10-04 10:10:23 -07:00
coda coda: use param->file for FSCONFIG_SET_FD 2024-08-19 13:45:03 +02:00
configfs fs/configfs: Add a callback to determine attribute visibility 2024-06-17 20:42:57 +02:00
cramfs vfs-6.11.module.description 2024-07-15 11:14:59 -07:00
crypto move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
debugfs [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
devpts
dlm [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
ecryptfs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
efivarfs [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
efs vfs-6.11.module.description 2024-07-15 11:14:59 -07:00
erofs erofs: use get_tree_bdev_flags() to avoid misleading messages 2024-10-21 14:30:27 +02:00
exfat move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
exportfs fhandle: relax open_by_handle_at() permission checks 2024-05-28 15:57:23 +02:00
ext2 vfs-6.12.file 2024-09-16 09:14:02 +02:00
ext4 ext4: fix off by one issue in alloc_flex_gd() 2024-10-04 17:36:28 -04:00
f2fs f2fs: allow parallel DIO reads 2024-10-11 15:12:07 +00:00
fat fat: fix uninitialized variable 2024-10-17 00:28:06 -07:00
freevxfs freevxfs: Convert freevxfs to the new mount API. 2024-03-26 09:04:53 +01:00
fuse fuse: remove stray debug line 2024-10-25 17:05:49 +02:00
gfs2 gfs2 changes 2024-09-23 11:55:17 -07:00
hfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
hfsplus move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
hostfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
hpfs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
hugetlbfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
iomap vfs-6.12-rc6.iomap 2024-11-01 07:45:00 -10:00
isofs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
jbd2 jbd2: remove unneeded check of ret in jbd2_fc_get_buf 2024-08-26 23:49:15 -04:00
jffs2 jffs2: Use a folio in jffs2_garbage_collect_dnode() 2024-08-19 13:40:00 +02:00
jfs jfs: Fix sanity check in dbMount 2024-10-22 09:40:37 -05:00
kernfs kernfs: mount: Remove unnecessary ‘NULL’ values from knparent 2024-05-04 19:02:39 +02:00
lockd move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
minix buffer: Convert __block_write_begin() to take a folio 2024-08-07 11:33:36 +02:00
netfs netfs: Downgrade i_rwsem for a buffered write 2024-10-17 15:33:42 +02:00
nfs NFS: remove revoked delegation from server's delegation list 2024-10-09 15:39:22 -04:00
nfs_common nfs_common: fix race in NFS calls to nfsd_file_put_local() and nfsd_serv_put() 2024-10-03 16:19:43 -04:00
nfsd nfsd-6.12 fixes: 2024-11-02 09:27:11 -10:00
nilfs2 nilfs2: fix potential deadlock with newly created symlinks 2024-10-30 20:14:12 -07:00
nls move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
notify inotify: Fix possible deadlock in fsnotify_destroy_mark 2024-10-02 15:14:29 +02:00
ntfs3 Changes for 6.12-rc3 2024-10-08 10:53:06 -07:00
ocfs2 ocfs2: pass u64 to ocfs2_truncate_inline maybe overflow 2024-10-28 21:40:40 -07:00
omfs fs: Convert aops->write_begin to take a folio 2024-08-07 11:33:21 +02:00
openpromfs openpromfs: add missing MODULE_DESCRIPTION() macro 2024-06-20 09:46:01 +02:00
orangefs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
overlayfs fs: pass offset and result to backing_file end_write() callback 2024-10-16 13:17:45 +02:00
proc vfs-6.12-rc5.fixes 2024-10-21 10:48:24 -07:00
pstore drm next for 6.12-rc1 2024-09-19 10:18:15 +02:00
qnx4 qnx4: add MODULE_DESCRIPTION() 2024-05-28 11:52:53 +02:00
qnx6 qnx6: Convert directory handling to use kmap_local 2024-08-07 11:31:56 +02:00
quota \n 2024-09-23 10:49:28 -07:00
ramfs mm: switch mm->get_unmapped_area() to a flag 2024-04-25 20:56:25 -07:00
reiserfs move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
romfs romfs: fix romfs_read_folio() 2024-08-21 22:32:58 +02:00
smb smb: client: Fix use-after-free of network namespace. 2024-11-03 19:28:31 -06:00
squashfs Squashfs: fix variable overflow in squashfs_readpage_block 2024-10-30 20:14:12 -07:00
sysfs Merge 6.9-rc5 into driver-core-next 2024-04-23 13:27:43 +02:00
sysv buffer: Convert __block_write_begin() to take a folio 2024-08-07 11:33:36 +02:00
tests execve: Move KUnit tests to tests/ subdirectory 2024-07-22 18:25:47 -07:00
tracefs eventfs: Use list_del_rcu() for SRCU protected list variable 2024-09-05 10:18:48 -04:00
ubifs [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
udf udf: fix uninit-value use in udf_get_fileshortad 2024-10-02 14:32:37 +02:00
ufs ufs_rename(): fix bogus argument of folio_release_kmap() 2024-10-02 00:05:09 -04:00
unicode unicode: Don't special case ignorable code points 2024-10-09 13:34:01 -04:00
vboxsf fs: Convert aops->write_end to take a folio 2024-08-07 11:32:02 +02:00
verity fsverity: expose verified fsverity built-in signatures to LSMs 2024-08-20 14:03:18 -04:00
xfs XFS bug fies for 6.12-rc6 2024-11-02 09:22:16 -10:00
zonefs zonefs fixes for 6.12-rc2 2024-10-02 12:02:15 -07:00
aio.c fs/aio: Fix __percpu annotation of *cpu pointer in struct kioctx 2024-08-19 13:45:03 +02:00
anon_inodes.c fs: Create anon_inode_getfile_fmode() 2024-04-26 10:33:05 +02:00
attr.c nfsd-6.11 fixes: 2024-08-29 06:20:44 +12:00
backing-file.c fs: pass offset and result to backing_file end_write() callback 2024-10-16 13:17:45 +02:00
bad_inode.c
binfmt_elf_fdpic.c binfmt_elf_fdpic: fix AUXV size calculation when ELF_HWCAP2 is defined 2024-08-26 13:00:38 -07:00
binfmt_elf.c Revert "binfmt_elf, coredump: Log the reason of the failed core dumps" 2024-09-26 11:39:02 -07:00
binfmt_flat.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
binfmt_misc.c vfs-6.11.module.description 2024-07-15 11:14:59 -07:00
binfmt_script.c fs: binfmt: add missing MODULE_DESCRIPTION() macros 2024-05-28 12:06:51 +02:00
bpf_fs_kfuncs.c bpf: Add kfunc bpf_get_dentry_xattr() to read xattr from dentry 2024-08-07 11:26:54 -07:00
buffer.c vfs-6.12.folio 2024-09-16 08:54:30 +02:00
char_dev.c
compat_binfmt_elf.c
coredump.c Revert "binfmt_elf, coredump: Log the reason of the failed core dumps" 2024-09-26 11:39:02 -07:00
d_path.c
dax.c fsdax: dax_unshare_iter needs to copy entire blocks 2024-10-07 13:51:47 +02:00
dcache.c vfs-6.12.misc 2024-09-16 08:35:09 +02:00
direct-io.c fs/direct-io: Remove linux/prefetch.h include 2024-08-19 13:45:02 +02:00
drop_caches.c sysctl: treewide: constify the ctl_table argument of proc_handlers 2024-07-24 20:59:29 +02:00
eventfd.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
eventpoll.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
exec.c ALong with the usual shower of singleton patches, notable patch series in 2024-09-21 07:29:05 -07:00
fcntl.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
fhandle.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
file_table.c slab updates for 6.12 2024-09-18 08:53:53 +02:00
file.c close_range(): fix the logics in descriptor table trimming 2024-09-29 21:52:29 -04:00
filesystems.c
fs_context.c
fs_parser.c fs_parse: add uid & gid option option parsing helpers 2024-07-02 06:20:49 +02:00
fs_pin.c
fs_struct.c
fs_types.c
fs-writeback.c inode: port __I_SYNC to var event 2024-08-30 08:22:39 +02:00
fsopen.c [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
init.c
inode.c bcachefs: do not use PF_MEMALLOC_NORECLAIM 2024-10-09 12:47:18 -07:00
internal.h file: reclaim 24 bytes from f_owner 2024-08-28 13:05:39 +02:00
ioctl.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
Kconfig nfs_common: fix Kconfig for NFS_COMMON_LOCALIO_SUPPORT 2024-10-03 16:19:51 -04:00
Kconfig.binfmt exec: Add KUnit test for bprm_stack_limits() 2024-06-19 13:13:55 -07:00
kernel_read_file.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
libfs.c vfs-6.12.folio 2024-09-16 08:54:30 +02:00
locks.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
Makefile bpf: introduce new VFS based BPF kfuncs 2024-08-06 09:01:41 -07:00
mbcache.c
mnt_idmapping.c fuse update for 6.12 2024-09-24 15:29:42 -07:00
mount.h vfs-6.12.mount 2024-09-16 11:15:26 +02:00
mpage.c buffer: Remove calls to set and clear the folio error flag 2024-05-31 12:31:43 +02:00
namei.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
namespace.c fs: don't try and remove empty rbtree node 2024-10-17 15:33:43 +02:00
nsfs.c [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
open.c openat2: explicitly return -E2BIG for (usize > PAGE_SIZE) 2024-10-10 12:09:03 +02:00
pidfs.c pidfs: check for valid pid namespace 2024-09-27 18:29:19 +02:00
pipe.c [tree-wide] finally take no_llseek out 2024-09-27 08:18:43 -07:00
pnode.c
pnode.h
posix_acl.c fs: Use in_group_or_capable() helper to simplify the code 2024-08-30 08:22:37 +02:00
proc_namespace.c fs: rename show_mnt_opts -> show_vfsmnt_opts 2024-06-28 14:36:43 +02:00
read_write.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
readdir.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
remap_range.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
select.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
seq_file.c seq_file: Simplify __seq_puts() 2024-05-02 16:28:20 +02:00
signalfd.c struct fd layout change (and conversion to accessor helpers) 2024-09-23 09:35:36 -07:00
splice.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
stack.c
stat.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
statfs.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
super.c fs/super.c: introduce get_tree_bdev_flags() 2024-10-21 14:30:26 +02:00
sync.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
sysctls.c
timerfd.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
userfaultfd.c fork: do not invoke uffd on fork if error occurs 2024-10-28 21:40:38 -07:00
utimes.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00
xattr.c introduce fd_file(), convert all accessors to it. 2024-08-12 22:00:43 -04:00