Commit Graph

1185223 Commits

Author SHA1 Message Date
Linus Torvalds
849a4f0973 xfs: bug fixes for 6.4-rc2
o fixes for inode garbage collection shutdown racing with work queue
   updates
 o ensure inodegc workers run on the CPU they are supposed to
 o disable counter scrubbing until we can exclusively freeze the
   filesystem from the kernel
 o Regression fixes for new allocation related bugs
 o a couple of minor cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCgAyFiEEmJOoJ8GffZYWSjj/regpR/R1+h0FAmRcSIsUHGRhdmlkQGZy
 b21vcmJpdC5jb20ACgkQregpR/R1+h2Y8xAAxtsTdOx71XtDuNyfBOiqzZgTCq6b
 6LsckJIDQa1AXjUNq9G3zWcUcWBcRWcw+CWbkqjqQ9W47K/ijLuoKnjRsQ+5B4DU
 TBUctVq+/Zk2lBlb6HKuKdzqDGnIFWGVKVd7u8KlowqnXuzUeQ0vFkT7ZHTepUKG
 P+midgGNVT4+tykq7oH0H8WxoTyNPZhKiAUcZjneBgA60IAoQWHA2iUt+SKpbrkL
 1HyK+/edVMTXiDXtyHfXmDaH9Pgy6NCpw3TNkPDhuL1UDpLhg/zgT39rFZGBsAUt
 gaDM3wN5jBrot/mvJE3rH9bdZhkcf+NQKPx/1DDg3DL8plS/1/LUC4cImdolBJ3w
 RNmgJv1lK+AlE4MUJ/bUDlEpHUmwAjnnsxBXwEvnYNfj+9V6/mDB+HqKiY7/XxVK
 vF77s6z+CWvefdnZavJ4/72pVVJNkcDYCYmvh/donRP6vtnwZyzocFUeBeNMInV1
 /s3WMrF9hwmJqAClKG7p1fnszWp658yFIuw/TXVs+NrjTtQgXwMpl2cEYYvUZEJN
 Trq2p0xH/JSwcnOPSPJO6WHb8UPoqrM6lgGFaJVWJx1AWt1i1CFLf5eA5X+XisDV
 AJKgpqlnDg02bBMQ0tMFGZUaNx/1S1mwtxcZsyEFTutpUNxqJKDaMohpxxrWb0WC
 ppSqDvyJN4wtlFI=
 =qok2
 -----END PGP SIGNATURE-----

Merge tag 'xfs-6.4-rc1-fixes' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux

Pull xfs bug fixes from Dave Chinner:
 "Largely minor bug fixes and cleanups, th emost important of which are
  probably the fixes for regressions in the extent allocation code:

   - fixes for inode garbage collection shutdown racing with work queue
     updates

   - ensure inodegc workers run on the CPU they are supposed to

   - disable counter scrubbing until we can exclusively freeze the
     filesystem from the kernel

   - regression fixes for new allocation related bugs

   - a couple of minor cleanups"

* tag 'xfs-6.4-rc1-fixes' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
  xfs: fix xfs_inodegc_stop racing with mod_delayed_work
  xfs: disable reaping in fscounters scrub
  xfs: check that per-cpu inodegc workers actually run on that cpu
  xfs: explicitly specify cpu when forcing inodegc delayed work to run immediately
  xfs: fix negative array access in xfs_getbmap
  xfs: don't allocate into the data fork for an unshare request
  xfs: flush dirty data and drain directios before scrubbing cow fork
  xfs: set bnobt/cntbt numrecs correctly when formatting new AGs
  xfs: don't unconditionally null args->pag in xfs_bmap_btalloc_at_eof
2023-05-11 16:51:11 -05:00
Linus Torvalds
105131df9c Devicetree binding fixes for v6.4
Few fixes for Devicetree bindings and related docs, all for issues
 introduced in v6.4-rc1 commits:
 
 1. media/ov2685: fix number of possible data lanes, as old binding
    explicitly mentioned one data lane. This fixes dt_binding_check
    warnings like:
    Documentation/devicetree/bindings/media/rockchip-isp1.example.dtb: camera@3c: port:endpoint:data-lanes: [[1]] is too short
    From schema: Documentation/devicetree/bindings/media/i2c/ovti,ov2685.yaml
 
 2. Maintainers: correct path of Apple PWM binding. This fixes
    refcheckdocs warning.
 
 3. PCI/fsl,imx6q: correct parsing of assigned-clocks and related
    properties and make the clocks more specific per PCI device (host or
    endpoint).  This fixes dtschema limitation and dt_binding_check
    warnings like:
 
    Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-ep.example.dtb: pcie-ep@33800000: Unevaluated properties are not allowed
    From schema: Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-ep.yaml
 -----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCgAuFiEE3dJiKD0RGyM7briowTdm5oaLg9cFAmRcxQIQHGtyemtAa2Vy
 bmVsLm9yZwAKCRDBN2bmhouD158qD/0bBKNr0Ydj9HsobdAsGnkIUIpGgl2TktxB
 tvOgbb4ppD3I4HAbh+AobXsB3zILnC1nv9Vky7jRxty9CquDotzRsedBhfih00fK
 2lU5eClF1tX2dFNU+yHeLmdQxgocxfGvpvO8GZWpL5f/sZWYsFkuBiOedrFeqbLZ
 6Sj231lW71rJj884HoeA/vLEQJXf4peHTGIfw5xQuSGninaForendT7fsZh9Ij4d
 MdNEnA3irxypDWbuHiEVKwv/Fbgp9RTomgi9+MfaecmcH1Z0zTbRn/AzGYiehIgw
 IeJA25HkOaiHNARyczQI2PzSPuFslFVT8vc7CtAgaX13NcJ01gbQDWm00W8tP5GX
 hih3ky50GImarwhSFizmv5pxsGXc9FaPAC5WMgXMADqFP2equzYngKlDcXBnKkiw
 hOByMJs9aGe2/fksExeV4OS3oBDLo3KTzvJ5biWNkcCaHxvNQT2fNPOMYuDxew1S
 qvFgcxc4b/DkH73yO7Jp0wOAbUwAfSxD6CX06AyKl64OOZ852mtS8R51G0+xHQcn
 bJ1uWgf9+TSlSZAUhdmcbkLRNTk+zM6T11hWoNB+hIr5FN4/KxteEsE5l0YvLMzo
 T7cTrrXBee1MuU33x6GbLybsewBZMOAnXm64sqpua6DyfuBY1kBUjzL9mTc6PaB/
 C4C6fY1Bew==
 =iuNU
 -----END PGP SIGNATURE-----

Merge tag 'dt-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-dt

Pull devicetree binding fixes from Krzysztof Kozlowski:
 "A few fixes for Devicetree bindings and related docs, all for issues
  introduced in v6.4-rc1 commits:

   - media/ov2685: fix number of possible data lanes, as old binding
     explicitly mentioned one data lane. This fixes dt_binding_check
     warnings like:

       Documentation/devicetree/bindings/media/rockchip-isp1.example.dtb: camera@3c: port:endpoint:data-lanes: [[1]] is too short
       From schema: Documentation/devicetree/bindings/media/i2c/ovti,ov2685.yaml

   - PCI/fsl,imx6q: correct parsing of assigned-clocks and related
     properties and make the clocks more specific per PCI device (host
     or endpoint). This fixes dtschema limitation and dt_binding_check
     warnings like:

       Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-ep.example.dtb: pcie-ep@33800000: Unevaluated properties are not allowed
       From schema: Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-ep.yaml

   - Maintainers: correct path of Apple PWM binding. This fixes
     refcheckdocs warning"

* tag 'dt-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-dt:
  dt-bindings: PCI: fsl,imx6q: fix assigned-clocks warning
  MAINTAINERS: adjust file entry for ARM/APPLE MACHINE SUPPORT
  media: dt-bindings: ov2685: Correct data-lanes attribute
2023-05-11 09:01:40 -05:00
Linus Torvalds
6e27831b91 Networking fixes for 6.4-rc2, including fixes from netfilter
Current release - regressions:
 
   - mtk_eth_soc: fix NULL pointer dereference
 
 Previous releases - regressions:
 
   - core:
     - skb_partial_csum_set() fix against transport header magic value
     - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
     - annotate sk->sk_err write from do_recvmmsg()
     - add vlan_get_protocol_and_depth() helper
 
   - netlink: annotate accesses to nlk->cb_running
 
   - netfilter: always release netdev hooks from notifier
 
 Previous releases - always broken:
 
   - core: deal with most data-races in sk_wait_event()
 
   - netfilter: fix possible bug_on with enable_hooks=1
 
   - eth: bonding: fix send_peer_notif overflow
 
   - eth: xpcs: fix incorrect number of interfaces
 
   - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb
 
   - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmRcxawSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkv6wQAJgfOBlDAkZNKHzwtMuFiLxECeEMWY9h
 wJCyiq0qXnz9p5ZqjdmTmA8B+jUp9VkpgN5Z3lid5hXDfzDrvXL1KGZW4pc4ooz9
 GUzrp0EUzO5UsyrlZRS9vJ9mbCGN5M1ZWtWH93g8OzGJPRnLs0Q/Tr4IFTBVKzVb
 GmJPy/ZYWYDjnvx3BgewRDuYeH3Rt9lsIt4Pxq/E+D8W3ypvVM0m3GvrO5eEzMeu
 EfeilAdmJGJUufeoGguKt0hheqILS3kNCjQO25XS2Lq1OqetnR/wqTwXaaVxL2du
 Eb2ca7wKkihDpl2l8bQ3ss6vqM0HEpZ63Y2PJaNBS8ASdLsMq4n2L6j2JMfT8hWY
 RG3nJS7F2UFLyYmCJjNL1/I+Z9XeMyFKnHORzHK1dAkMlhd+8NauKWAxdjlxMbxX
 p1msyTl54bG0g6FrU/zAirCWNAAZYCPdZG/XvA/2Jj9mdy64OlGlv/QdJvfjcx+C
 L6nkwZfwXU7QUwKeeTfP8abte2SLrXIxkJrnNEAntPnFOSmd16+/yvQ8JVlbWTMd
 JugJrSAIxjOglIr/1fsnUuV+Ab+JDYQv/wkoyzvtcY2tjhTAHzgmTwwSfeYiCTJE
 rEbjyVvVgMcLTUIk/R9QC5/k6nX/7/KRDHxPOMBX4boOsuA0ARVjzt8uKRvv/7cS
 dRV98RwvCKvD
 =MoPD
 -----END PGP SIGNATURE-----

Merge tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from netfilter.

  Current release - regressions:

   - mtk_eth_soc: fix NULL pointer dereference

  Previous releases - regressions:

   - core:
      - skb_partial_csum_set() fix against transport header magic value
      - fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
      - annotate sk->sk_err write from do_recvmmsg()
      - add vlan_get_protocol_and_depth() helper

   - netlink: annotate accesses to nlk->cb_running

   - netfilter: always release netdev hooks from notifier

  Previous releases - always broken:

   - core: deal with most data-races in sk_wait_event()

   - netfilter: fix possible bug_on with enable_hooks=1

   - eth: bonding: fix send_peer_notif overflow

   - eth: xpcs: fix incorrect number of interfaces

   - eth: ipvlan: fix out-of-bounds caused by unclear skb->cb

   - eth: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register"

* tag 'net-6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (31 commits)
  af_unix: Fix data races around sk->sk_shutdown.
  af_unix: Fix a data race of sk->sk_receive_queue->qlen.
  net: datagram: fix data-races in datagram_poll()
  net: mscc: ocelot: fix stat counter register values
  ipvlan:Fix out-of-bounds caused by unclear skb->cb
  docs: networking: fix x25-iface.rst heading & index order
  gve: Remove the code of clearing PBA bit
  tcp: add annotations around sk->sk_shutdown accesses
  net: add vlan_get_protocol_and_depth() helper
  net: pcs: xpcs: fix incorrect number of interfaces
  net: deal with most data-races in sk_wait_event()
  net: annotate sk->sk_err write from do_recvmmsg()
  netlink: annotate accesses to nlk->cb_running
  kselftest: bonding: add num_grat_arp test
  selftests: forwarding: lib: add netns support for tc rule handle stats get
  Documentation: bonding: fix the doc of peer_notif_delay
  bonding: fix send_peer_notif overflow
  net: ethernet: mtk_eth_soc: fix NULL pointer dereference
  selftests: nft_flowtable.sh: check ingress/egress chain too
  selftests: nft_flowtable.sh: monitor result file sizes
  ...
2023-05-11 08:42:47 -05:00
Linus Torvalds
691e1eee1b media fixes for v6.4-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQIyBAABCAAdFiEE+QmuaPwR3wnBdVwACF8+vY7k4RUFAmRcg4gACgkQCF8+vY7k
 4RWPGA/0CbjmuSvikL7h2Cjlusp1HAvjhcHyXElKuob87ZlMq248eEQjwmeYkefG
 Oc1q3dHJR52BHwMH/9RhRfwKxFcSjKtb3Ed8R492OxZgIg9urrpb8YA4Z1r4qMaU
 Ehd3AS1a8Fe8pLWNxG4cxcs7Nl5giMTnBORMsxcaLx4miRwkfoNWBCY7L50yjOGA
 l2OYht6sqZMI+CPc4I+C7cgAVCflJBnIGGBoKZI+HSRdMpN1Ndown/O2vc5bZbGh
 mw2MKgMr3+QUxjMm1kebHFeYIgLtg2p95/kjlQlMEqgQiIdhGIMpv3KqZjeRqZZW
 4WAawUMwZEmPXV01OgQtQQT79LYZYVLTxJewm8nRMBLsBbbYNCLLExzVdfY2cxxt
 aXHYnmLsL2wIicQIdNtq9NuptlVo6+zJpBpFQR5BfXT2VhBT/JF7iB7y5m6B269u
 Ah5Sv1ygB6JDrkxxPzdsRSavwxT4tMx5bzIUUvcg9D9AyfXgF/jxxK6MYuALWOv0
 YUZD1wH4XqoRKITqTfts8kvu9+jDkh9ruBHBwDM/i+v9CNyIAQaU32Yolit8IGgx
 ArrDiGigzbD2Z9e8NVZFJy+15PQgpCUBOUagU5w/eszhBo25n7N2a8EjNSHQEzYb
 7IHHVzjZzizhMLdJSWNAFolgmiL+DcZnuhC79rep+EdG7FIp1Q==
 =XxmV
 -----END PGP SIGNATURE-----

Merge tag 'media/v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media

Pull media fixes from Mauro Carvalho Chehab:

 - fix some unused-variable warning in mtk-mdp3

 - ignore unused suspend operations in nxp

 - some driver fixes in rcar-vin

* tag 'media/v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media:
  media: platform: mtk-mdp3: work around unused-variable warning
  media: nxp: ignore unused suspend operations
  media: rcar-vin: Select correct interrupt mode for V4L2_FIELD_ALTERNATE
  media: rcar-vin: Fix NV12 size alignment
  media: rcar-vin: Gen3 can not scale NV12
2023-05-11 08:35:52 -05:00
Jakub Kicinski
cceac92678 netfilter pull request 23-05-10
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmRbUnQACgkQ1V2XiooU
 IOQNVw//eoCiid3J4TuNdzHaBHXUlZvln1n5Z6K5fIz5ytrY1T7oKC8Y9StkzzWR
 29ToFLOJt1iRAcgxsghWRIwUzNwpuUdqgd9cUZMHMQxT0BJItp6FXUql2+1LkF/I
 b/gnnb90zyE7lBS/VSRyOiqMiJlP+Som22d7Nn5k2KfTYEdXKwfzjsWAu3W3Sb0s
 Lv/MA9DE42qcwiZubmFmDtOtAunPJFZm3HgkcAVeXoNkBDrSfkvxLIMYG6VfFNhQ
 AkKMyzX293wpwVxfOuQfJr4QVlxAgOQUko+FqajoWMBtfA3yldZjJ8RC7c9Af9uI
 ciOP11vHBCG84KrTabC5kdqOcvadreDiM/oIvk57ztQhCr3e+po+vIz6Cv4p9I30
 m5GXfgbtMRl6hM2S5lrRc5fNRkYJHE4aNvesFTGaLpK3LogusH1E9mH5jdjnbU42
 TwkIe250qJJelNn9ZxS5Jt0BgyogNfeiA9lOXmaQBpYmwIahrjDf0g8z2I85nPhF
 PDukjSXsxi4uHwpSF5wFrlqkAPiEX4vC95uSUTVbbBgZrHhNJdGgu4FJSHRM4mVo
 6Awxk2O2bvcDpXEfTBdD4EJF71bZ/aH2i8ddU1oupIl88O01TWTtkO4SYHqMX+tC
 fPnpGzBN8KJvHgeFxO4p0v1oelv2RtiITv1gi7YJOnq/+Vd2yy0=
 =HmfP
 -----END PGP SIGNATURE-----

Merge tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter updates for net

The following patchset contains Netfilter fixes for net:

1) Fix UAF when releasing netnamespace, from Florian Westphal.

2) Fix possible BUG_ON when nf_conntrack is enabled with enable_hooks,
   from Florian Westphal.

3) Fixes for nft_flowtable.sh selftest, from Boris Sukholitko.

4) Extend nft_flowtable.sh selftest to cover integration with
   ingress/egress hooks, from Florian Westphal.

* tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  selftests: nft_flowtable.sh: check ingress/egress chain too
  selftests: nft_flowtable.sh: monitor result file sizes
  selftests: nft_flowtable.sh: wait for specific nc pids
  selftests: nft_flowtable.sh: no need for ps -x option
  selftests: nft_flowtable.sh: use /proc for pid checking
  netfilter: conntrack: fix possible bug_on with enable_hooks=1
  netfilter: nf_tables: always release netdev hooks from notifier
====================

Link: https://lore.kernel.org/r/20230510083313.152961-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-10 19:08:58 -07:00
Jakub Kicinski
33dcee99e0 Merge branch 'af_unix-fix-two-data-races-reported-by-kcsan'
Kuniyuki Iwashima says:

====================
af_unix: Fix two data races reported by KCSAN.

KCSAN reported data races around these two fields for AF_UNIX sockets.

  * sk->sk_receive_queue->qlen
  * sk->sk_shutdown

Let's annotate them properly.
====================

Link: https://lore.kernel.org/r/20230510003456.42357-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-10 19:06:55 -07:00
Kuniyuki Iwashima
e1d09c2c2f af_unix: Fix data races around sk->sk_shutdown.
KCSAN found a data race around sk->sk_shutdown where unix_release_sock()
and unix_shutdown() update it under unix_state_lock(), OTOH unix_poll()
and unix_dgram_poll() read it locklessly.

We need to annotate the writes and reads with WRITE_ONCE() and READ_ONCE().

BUG: KCSAN: data-race in unix_poll / unix_release_sock

write to 0xffff88800d0f8aec of 1 bytes by task 264 on cpu 0:
 unix_release_sock+0x75c/0x910 net/unix/af_unix.c:631
 unix_release+0x59/0x80 net/unix/af_unix.c:1042
 __sock_release+0x7d/0x170 net/socket.c:653
 sock_close+0x19/0x30 net/socket.c:1397
 __fput+0x179/0x5e0 fs/file_table.c:321
 ____fput+0x15/0x20 fs/file_table.c:349
 task_work_run+0x116/0x1a0 kernel/task_work.c:179
 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline]
 exit_to_user_mode_loop kernel/entry/common.c:171 [inline]
 exit_to_user_mode_prepare+0x174/0x180 kernel/entry/common.c:204
 __syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
 syscall_exit_to_user_mode+0x1a/0x30 kernel/entry/common.c:297
 do_syscall_64+0x4b/0x90 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

read to 0xffff88800d0f8aec of 1 bytes by task 222 on cpu 1:
 unix_poll+0xa3/0x2a0 net/unix/af_unix.c:3170
 sock_poll+0xcf/0x2b0 net/socket.c:1385
 vfs_poll include/linux/poll.h:88 [inline]
 ep_item_poll.isra.0+0x78/0xc0 fs/eventpoll.c:855
 ep_send_events fs/eventpoll.c:1694 [inline]
 ep_poll fs/eventpoll.c:1823 [inline]
 do_epoll_wait+0x6c4/0xea0 fs/eventpoll.c:2258
 __do_sys_epoll_wait fs/eventpoll.c:2270 [inline]
 __se_sys_epoll_wait fs/eventpoll.c:2265 [inline]
 __x64_sys_epoll_wait+0xcc/0x190 fs/eventpoll.c:2265
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

value changed: 0x00 -> 0x03

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 222 Comm: dbus-broker Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 3c73419c09 ("af_unix: fix 'poll for write'/ connected DGRAM sockets")
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-10 19:06:53 -07:00
Kuniyuki Iwashima
679ed006d4 af_unix: Fix a data race of sk->sk_receive_queue->qlen.
KCSAN found a data race of sk->sk_receive_queue->qlen where recvmsg()
updates qlen under the queue lock and sendmsg() checks qlen under
unix_state_sock(), not the queue lock, so the reader side needs
READ_ONCE().

BUG: KCSAN: data-race in __skb_try_recv_from_queue / unix_wait_for_peer

write (marked) to 0xffff888019fe7c68 of 4 bytes by task 49792 on cpu 0:
 __skb_unlink include/linux/skbuff.h:2347 [inline]
 __skb_try_recv_from_queue+0x3de/0x470 net/core/datagram.c:197
 __skb_try_recv_datagram+0xf7/0x390 net/core/datagram.c:263
 __unix_dgram_recvmsg+0x109/0x8a0 net/unix/af_unix.c:2452
 unix_dgram_recvmsg+0x94/0xa0 net/unix/af_unix.c:2549
 sock_recvmsg_nosec net/socket.c:1019 [inline]
 ____sys_recvmsg+0x3a3/0x3b0 net/socket.c:2720
 ___sys_recvmsg+0xc8/0x150 net/socket.c:2764
 do_recvmmsg+0x182/0x560 net/socket.c:2858
 __sys_recvmmsg net/socket.c:2937 [inline]
 __do_sys_recvmmsg net/socket.c:2960 [inline]
 __se_sys_recvmmsg net/socket.c:2953 [inline]
 __x64_sys_recvmmsg+0x153/0x170 net/socket.c:2953
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

read to 0xffff888019fe7c68 of 4 bytes by task 49793 on cpu 1:
 skb_queue_len include/linux/skbuff.h:2127 [inline]
 unix_recvq_full net/unix/af_unix.c:229 [inline]
 unix_wait_for_peer+0x154/0x1a0 net/unix/af_unix.c:1445
 unix_dgram_sendmsg+0x13bc/0x14b0 net/unix/af_unix.c:2048
 sock_sendmsg_nosec net/socket.c:724 [inline]
 sock_sendmsg+0x148/0x160 net/socket.c:747
 ____sys_sendmsg+0x20e/0x620 net/socket.c:2503
 ___sys_sendmsg+0xc6/0x140 net/socket.c:2557
 __sys_sendmmsg+0x11d/0x370 net/socket.c:2643
 __do_sys_sendmmsg net/socket.c:2672 [inline]
 __se_sys_sendmmsg net/socket.c:2669 [inline]
 __x64_sys_sendmmsg+0x58/0x70 net/socket.c:2669
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

value changed: 0x0000000b -> 0x00000001

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 49793 Comm: syz-executor.0 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-10 19:06:53 -07:00
Eric Dumazet
5bca1d081f net: datagram: fix data-races in datagram_poll()
datagram_poll() runs locklessly, we should add READ_ONCE()
annotations while reading sk->sk_err, sk->sk_shutdown and sk->sk_state.

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Link: https://lore.kernel.org/r/20230509173131.3263780-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-10 19:06:49 -07:00
Linus Torvalds
80e62bc848 MAINTAINERS: re-sort all entries and fields
It's been a few years since we've sorted this thing, and the end result
is that we've added MAINTAINERS entries in the wrong order, and a number
of entries have their fields in non-canonical order too.

So roll this boulder up the hill one more time by re-running

   ./scripts/parse-maintainers.pl --order

on it.

This file ends up being fairly painful for merge conflicts even
normally, since unlike almost all other kernel files it's one of those
"everybody touches the same thing", and re-ordering all entries is only
going to make that worse.  But the alternative is to never do it at all,
and just let it all rot..

The rc2 week is likely the quietest and least painful time to do this.

Requested-by: Randy Dunlap <rdunlap@infradead.org>
Requested-by: Joe Perches <joe@perches.com>	# "Please use --order"
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-05-10 20:04:04 -05:00
Linus Torvalds
d295b66a7b \n
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmRcFLAACgkQnJ2qBz9k
 QNmSuQf+P+hFOHVpCb83S4TrNsfPC9M7Pv/M1Rw926p1gESNLez1ElAx2B35OIBZ
 y4ijqrSHgXbCsC3PvbxR9YfBPJZwIhtmKwjPC+9KlhAFB536Bvt/nt34fjRCWJiS
 nABnexegkirk7EQufAGVZQ1gtm1p0qSCA1B/l5AFl4KeLIin31U/08F2eE1OAo3K
 TmxhXzlC+JRXjf/X/X8mzJ8nz/hY5039BDvs7EpVdOsZcLFRQcCUygUCrOdV94uI
 1wyzMoGNaiIsZHpHPf3NtphUoE9IFpZQJGXMTR4ehBgFYkkOHYxsjW+wn8YOmaks
 qzRRXb9w0WdgAXChDGu1xYXV6Wpjiw==
 =kqSe
 -----END PGP SIGNATURE-----

Merge tag 'fsnotify_for_v6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs

Pull inotify fix from Jan Kara:
 "A fix for possibly reporting invalid watch descriptor with inotify
  event"

* tag 'fsnotify_for_v6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
  inotify: Avoid reporting event with invalid wd
2023-05-10 17:07:42 -05:00
Linus Torvalds
2a78769da3 gfs2 fix
- Fix a NULL pointer dereference when mounting corrupted filesystems.
 -----BEGIN PGP SIGNATURE-----
 
 iQJIBAABCAAyFiEEJZs3krPW0xkhLMTc1b+f6wMTZToFAmRbtSYUHGFncnVlbmJh
 QHJlZGhhdC5jb20ACgkQ1b+f6wMTZTrQSg/6A7xD25DpU1CSL18CEKMxnViXPNLP
 IYiTtkByCsRkgQKR4KMtVgPD/lEyWbhpuZJzSF4mZkslpnngjL/o+Rj4Sds9GvPL
 +66CSJAqjpJUjgGDvAoiJcZYOejRyqZLCHzKF54h0oJ3k9JCd3xN86EX8vKGoc2e
 F/As+E1Wnf0PP4jiERDyFhhg95MNxFuNVJ/JJ7mwtWpumx+60SpGzzUp575AHcoe
 4zHkVHlj6FAz2G8l8NADYAc0wMuhQ05g19psHzRDr2bB9zHBy9kWzXN0MYrhRvuY
 HuQ3CNDbXCJPXFjrNEmLV0f7Y8yCk/YNrYPbo4zV+AW0l7NzesqzrwkP/qgNfySD
 pnQJ/mQ76P/h728fln0pYvO8GnFUFoKKoCR27jH4F/Mntr4hgR4FZNDhvK6BMQNc
 PpipJcXWT8AUY45Sg3TLKPEhjp37K+abiMKfyTEFQRvLT83O6WyCUx2HYpZI7l87
 /aTIsidlYkb2ZSeLeXEfBplRXwF7NQNvvRJCGQKHAN7UbTB1o2RVpC5bn6UIghb1
 eQQZObs4AUfmzHthF2OaeJy08AiFidbOVLmbpZuj2iOWbJzmjJea8YT/GdntxhMT
 swmkHJqdmZEYoSKKvDmemT6l7qtOwijg7sTQ/d2D77lk8QPPIHdUw6g1OVvMqoNE
 oESpIA4O2fshlpU=
 =/s3O
 -----END PGP SIGNATURE-----

Merge tag 'gfs2-v6.3-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2

Pull gfs2 fix from Andreas Gruenbacher:

 - Fix a NULL pointer dereference when mounting corrupted filesystems

* tag 'gfs2-v6.3-fix' of git://git.kernel.org/pub/scm/linux/kernel/git/gfs2/linux-gfs2:
  gfs2: Don't deref jdesc in evict
2023-05-10 16:48:35 -05:00
Bob Peterson
504a10d9e4 gfs2: Don't deref jdesc in evict
On corrupt gfs2 file systems the evict code can try to reference the
journal descriptor structure, jdesc, after it has been freed and set to
NULL. The sequence of events is:

init_journal()
...
fail_jindex:
   gfs2_jindex_free(sdp); <------frees journals, sets jdesc = NULL
      if (gfs2_holder_initialized(&ji_gh))
         gfs2_glock_dq_uninit(&ji_gh);
fail:
   iput(sdp->sd_jindex); <--references jdesc in evict_linked_inode
      evict()
         gfs2_evict_inode()
            evict_linked_inode()
               ret = gfs2_trans_begin(sdp, 0, sdp->sd_jdesc->jd_blocks);
<------references the now freed/zeroed sd_jdesc pointer.

The call to gfs2_trans_begin is done because the truncate_inode_pages
call can cause gfs2 events that require a transaction, such as removing
journaled data (jdata) blocks from the journal.

This patch fixes the problem by adding a check for sdp->sd_jdesc to
function gfs2_evict_inode. In theory, this should only happen to corrupt
gfs2 file systems, when gfs2 detects the problem, reports it, then tries
to evict all the system inodes it has read in up to that point.

Reported-by: Yang Lan <lanyang0908@gmail.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
2023-05-10 17:15:18 +02:00
Linus Torvalds
ad2fd53a78 platform-drivers-x86 for v6.4-2
Highlights:
  -  thinkpad_acpi: Fix profile (performance/bal/low-power) regression on T490
  -  misc. other small fixes / hw-id additions
 
 The following is an automated git shortlog grouped by driver:
 
 hp-wmi:
  -  add micmute to hp_wmi_keymap struct
 
 intel_scu_pcidrv:
  -  Add back PCI ID for Medfield
 
 platform/mellanox:
  -  fix potential race in mlxbf-tmfifo driver
 
 platform/x86/intel-uncore-freq:
  -  Return error on write frequency
 
 thinkpad_acpi:
  -  Add profile force ability
  -  Fix platform profiles on T490
 
 touchscreen_dmi:
  -  Add info for the Dexp Ursus KX210i
  -  Add upside-down quirk for GDIX1002 ts on the Juno Tablet
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmRbcQ4UHGhkZWdvZWRl
 QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yFnQf9G83wo2B5SfjN7V7viYHEGB4InfQL
 +MbhwP+Efe++BkxwAzVMcVLSuYOsbqcrRBCiQkBVCyAF14LVVH3ymnn/upoqz9l4
 K4JUZorGIEPgl0SpGP+41j+P93XYfcfMYPRetZA8ErBpitSfHVf9oAG8vTgaemda
 Qxo8DWs4R1WGdg8KZ7icZ7shXiyzIeGBnQB/5EzpBrv6v6EmiJsUZh5ZZASOidrg
 IXTp8LgLhkFbIQBgHoOyHoM7puRI7MRbxu5YEJ+wKumFi4KmuTET5iDlZFpmqvoH
 eAU/JUWIsQ3I1QwTKFSLFYkJZCZrIIJHew1Lvs9F+/x1c2vs1QjOG8RoEg==
 =SEuW
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86

Pull x86 platform driver fixes from Hans de Goede:
 "Nothing special to report just various small fixes:

   - thinkpad_acpi: Fix profile (performance/bal/low-power) regression
     on T490

   - misc other small fixes / hw-id additions"

* tag 'platform-drivers-x86-v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
  platform/mellanox: fix potential race in mlxbf-tmfifo driver
  platform/x86: touchscreen_dmi: Add info for the Dexp Ursus KX210i
  platform/x86: touchscreen_dmi: Add upside-down quirk for GDIX1002 ts on the Juno Tablet
  platform/x86: thinkpad_acpi: Add profile force ability
  platform/x86: thinkpad_acpi: Fix platform profiles on T490
  platform/x86: hp-wmi: add micmute to hp_wmi_keymap struct
  platform/x86/intel-uncore-freq: Return error on write frequency
  platform/x86: intel_scu_pcidrv: Add back PCI ID for Medfield
2023-05-10 09:36:42 -05:00
Colin Foster
cdc2e28e21 net: mscc: ocelot: fix stat counter register values
Commit d4c3676507 ("net: mscc: ocelot: keep ocelot_stat_layout by reg
address, not offset") organized the stats counters for Ocelot chips, namely
the VSC7512 and VSC7514. A few of the counter offsets were incorrect, and
were caught by this warning:

WARNING: CPU: 0 PID: 24 at drivers/net/ethernet/mscc/ocelot_stats.c:909
ocelot_stats_init+0x1fc/0x2d8
reg 0x5000078 had address 0x220 but reg 0x5000079 has address 0x214,
bulking broken!

Fix these register offsets.

Fixes: d4c3676507 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset")
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 12:11:18 +01:00
t.feng
90cbed5247 ipvlan:Fix out-of-bounds caused by unclear skb->cb
If skb enqueue the qdisc, fq_skb_cb(skb)->time_to_send is changed which
is actually skb->cb, and IPCB(skb_in)->opt will be used in
__ip_options_echo. It is possible that memcpy is out of bounds and lead
to stack overflow.
We should clear skb->cb before ip_local_out or ip6_local_out.

v2:
1. clean the stack info
2. use IPCB/IP6CB instead of skb->cb

crash on stable-5.10(reproduce in kasan kernel).
Stack info:
[ 2203.651571] BUG: KASAN: stack-out-of-bounds in
__ip_options_echo+0x589/0x800
[ 2203.653327] Write of size 4 at addr ffff88811a388f27 by task
swapper/3/0
[ 2203.655460] CPU: 3 PID: 0 Comm: swapper/3 Kdump: loaded Not tainted
5.10.0-60.18.0.50.h856.kasan.eulerosv2r11.x86_64 #1
[ 2203.655466] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996),
BIOS rel-1.10.2-0-g5f4c7b1-20181220_000000-szxrtosci10000 04/01/2014
[ 2203.655475] Call Trace:
[ 2203.655481]  <IRQ>
[ 2203.655501]  dump_stack+0x9c/0xd3
[ 2203.655514]  print_address_description.constprop.0+0x19/0x170
[ 2203.655530]  __kasan_report.cold+0x6c/0x84
[ 2203.655586]  kasan_report+0x3a/0x50
[ 2203.655594]  check_memory_region+0xfd/0x1f0
[ 2203.655601]  memcpy+0x39/0x60
[ 2203.655608]  __ip_options_echo+0x589/0x800
[ 2203.655654]  __icmp_send+0x59a/0x960
[ 2203.655755]  nf_send_unreach+0x129/0x3d0 [nf_reject_ipv4]
[ 2203.655763]  reject_tg+0x77/0x1bf [ipt_REJECT]
[ 2203.655772]  ipt_do_table+0x691/0xa40 [ip_tables]
[ 2203.655821]  nf_hook_slow+0x69/0x100
[ 2203.655828]  __ip_local_out+0x21e/0x2b0
[ 2203.655857]  ip_local_out+0x28/0x90
[ 2203.655868]  ipvlan_process_v4_outbound+0x21e/0x260 [ipvlan]
[ 2203.655931]  ipvlan_xmit_mode_l3+0x3bd/0x400 [ipvlan]
[ 2203.655967]  ipvlan_queue_xmit+0xb3/0x190 [ipvlan]
[ 2203.655977]  ipvlan_start_xmit+0x2e/0xb0 [ipvlan]
[ 2203.655984]  xmit_one.constprop.0+0xe1/0x280
[ 2203.655992]  dev_hard_start_xmit+0x62/0x100
[ 2203.656000]  sch_direct_xmit+0x215/0x640
[ 2203.656028]  __qdisc_run+0x153/0x1f0
[ 2203.656069]  __dev_queue_xmit+0x77f/0x1030
[ 2203.656173]  ip_finish_output2+0x59b/0xc20
[ 2203.656244]  __ip_finish_output.part.0+0x318/0x3d0
[ 2203.656312]  ip_finish_output+0x168/0x190
[ 2203.656320]  ip_output+0x12d/0x220
[ 2203.656357]  __ip_queue_xmit+0x392/0x880
[ 2203.656380]  __tcp_transmit_skb+0x1088/0x11c0
[ 2203.656436]  __tcp_retransmit_skb+0x475/0xa30
[ 2203.656505]  tcp_retransmit_skb+0x2d/0x190
[ 2203.656512]  tcp_retransmit_timer+0x3af/0x9a0
[ 2203.656519]  tcp_write_timer_handler+0x3ba/0x510
[ 2203.656529]  tcp_write_timer+0x55/0x180
[ 2203.656542]  call_timer_fn+0x3f/0x1d0
[ 2203.656555]  expire_timers+0x160/0x200
[ 2203.656562]  run_timer_softirq+0x1f4/0x480
[ 2203.656606]  __do_softirq+0xfd/0x402
[ 2203.656613]  asm_call_irq_on_stack+0x12/0x20
[ 2203.656617]  </IRQ>
[ 2203.656623]  do_softirq_own_stack+0x37/0x50
[ 2203.656631]  irq_exit_rcu+0x134/0x1a0
[ 2203.656639]  sysvec_apic_timer_interrupt+0x36/0x80
[ 2203.656646]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[ 2203.656654] RIP: 0010:default_idle+0x13/0x20
[ 2203.656663] Code: 89 f0 5d 41 5c 41 5d 41 5e c3 cc cc cc cc cc cc cc
cc cc cc cc cc cc 0f 1f 44 00 00 0f 1f 44 00 00 0f 00 2d 9f 32 57 00 fb
f4 <c3> cc cc cc cc 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 41 54 be 08
[ 2203.656668] RSP: 0018:ffff88810036fe78 EFLAGS: 00000256
[ 2203.656676] RAX: ffffffffaf2a87f0 RBX: ffff888100360000 RCX:
ffffffffaf290191
[ 2203.656681] RDX: 0000000000098b5e RSI: 0000000000000004 RDI:
ffff88811a3c4f60
[ 2203.656686] RBP: 0000000000000000 R08: 0000000000000001 R09:
ffff88811a3c4f63
[ 2203.656690] R10: ffffed10234789ec R11: 0000000000000001 R12:
0000000000000003
[ 2203.656695] R13: ffff888100360000 R14: 0000000000000000 R15:
0000000000000000
[ 2203.656729]  default_idle_call+0x5a/0x150
[ 2203.656735]  cpuidle_idle_call+0x1c6/0x220
[ 2203.656780]  do_idle+0xab/0x100
[ 2203.656786]  cpu_startup_entry+0x19/0x20
[ 2203.656793]  secondary_startup_64_no_verify+0xc2/0xcb

[ 2203.657409] The buggy address belongs to the page:
[ 2203.658648] page:0000000027a9842f refcount:1 mapcount:0
mapping:0000000000000000 index:0x0 pfn:0x11a388
[ 2203.658665] flags:
0x17ffffc0001000(reserved|node=0|zone=2|lastcpupid=0x1fffff)
[ 2203.658675] raw: 0017ffffc0001000 ffffea000468e208 ffffea000468e208
0000000000000000
[ 2203.658682] raw: 0000000000000000 0000000000000000 00000001ffffffff
0000000000000000
[ 2203.658686] page dumped because: kasan: bad access detected

To reproduce(ipvlan with IPVLAN_MODE_L3):
Env setting:
=======================================================
modprobe ipvlan ipvlan_default_mode=1
sysctl net.ipv4.conf.eth0.forwarding=1
iptables -t nat -A POSTROUTING -s 20.0.0.0/255.255.255.0 -o eth0 -j
MASQUERADE
ip link add gw link eth0 type ipvlan
ip -4 addr add 20.0.0.254/24 dev gw
ip netns add net1
ip link add ipv1 link eth0 type ipvlan
ip link set ipv1 netns net1
ip netns exec net1 ip link set ipv1 up
ip netns exec net1 ip -4 addr add 20.0.0.4/24 dev ipv1
ip netns exec net1 route add default gw 20.0.0.254
ip netns exec net1 tc qdisc add dev ipv1 root netem loss 10%
ifconfig gw up
iptables -t filter -A OUTPUT -p tcp --dport 8888 -j REJECT --reject-with
icmp-port-unreachable
=======================================================
And then excute the shell(curl any address of eth0 can reach):

for((i=1;i<=100000;i++))
do
        ip netns exec net1 curl x.x.x.x:8888
done
=======================================================

Fixes: 2ad7bf3638 ("ipvlan: Initial check-in of the IPVLAN driver.")
Signed-off-by: "t.feng" <fengtao40@huawei.com>
Suggested-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 10:33:06 +01:00
Randy Dunlap
77c964dad9 docs: networking: fix x25-iface.rst heading & index order
Fix the chapter heading for "X.25 Device Driver Interface" so that it
does not contain a trailing '-' character, which makes Sphinx
omit this heading from the contents.

Reverse the order of the x25.rst and x25-iface.rst files in the index
so that the project introduction (x25.rst) comes first.

Fixes: 883780af72 ("docs: networking: convert x25-iface.txt to ReST")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: Martin Schiller <ms@dev.tdt.de>
Cc: linux-x25@vger.kernel.org
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 10:31:46 +01:00
Ziwei Xiao
f4c2e67c17 gve: Remove the code of clearing PBA bit
Clearing the PBA bit from the driver is race prone and it may lead to
dropped interrupt events. This could potentially lead to the traffic
being completely halted.

Fixes: 5e8c5adf95 ("gve: DQO: Add core netdev features")
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 10:30:46 +01:00
Eric Dumazet
e14cadfd80 tcp: add annotations around sk->sk_shutdown accesses
Now sk->sk_shutdown is no longer a bitfield, we can add
standard READ_ONCE()/WRITE_ONCE() annotations to silence
KCSAN reports like the following:

BUG: KCSAN: data-race in tcp_disconnect / tcp_poll

write to 0xffff88814588582c of 1 bytes by task 3404 on cpu 1:
tcp_disconnect+0x4d6/0xdb0 net/ipv4/tcp.c:3121
__inet_stream_connect+0x5dd/0x6e0 net/ipv4/af_inet.c:715
inet_stream_connect+0x48/0x70 net/ipv4/af_inet.c:727
__sys_connect_file net/socket.c:2001 [inline]
__sys_connect+0x19b/0x1b0 net/socket.c:2018
__do_sys_connect net/socket.c:2028 [inline]
__se_sys_connect net/socket.c:2025 [inline]
__x64_sys_connect+0x41/0x50 net/socket.c:2025
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88814588582c of 1 bytes by task 3374 on cpu 0:
tcp_poll+0x2e6/0x7d0 net/ipv4/tcp.c:562
sock_poll+0x253/0x270 net/socket.c:1383
vfs_poll include/linux/poll.h:88 [inline]
io_poll_check_events io_uring/poll.c:281 [inline]
io_poll_task_func+0x15a/0x820 io_uring/poll.c:333
handle_tw_list io_uring/io_uring.c:1184 [inline]
tctx_task_work+0x1fe/0x4d0 io_uring/io_uring.c:1246
task_work_run+0x123/0x160 kernel/task_work.c:179
get_signal+0xe64/0xff0 kernel/signal.c:2635
arch_do_signal_or_restart+0x89/0x2a0 arch/x86/kernel/signal.c:306
exit_to_user_mode_loop+0x6f/0xe0 kernel/entry/common.c:168
exit_to_user_mode_prepare+0x6c/0xb0 kernel/entry/common.c:204
__syscall_exit_to_user_mode_work kernel/entry/common.c:286 [inline]
syscall_exit_to_user_mode+0x26/0x140 kernel/entry/common.c:297
do_syscall_64+0x4d/0xc0 arch/x86/entry/common.c:86
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x03 -> 0x00

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 10:27:31 +01:00
Eric Dumazet
4063384ef7 net: add vlan_get_protocol_and_depth() helper
Before blamed commit, pskb_may_pull() was used instead
of skb_header_pointer() in __vlan_get_protocol() and friends.

Few callers depended on skb->head being populated with MAC header,
syzbot caught one of them (skb_mac_gso_segment())

Add vlan_get_protocol_and_depth() to make the intent clearer
and use it where sensible.

This is a more generic fix than commit e9d3f80935
("net/af_packet: make sure to pull mac header") which was
dealing with a similar issue.

kernel BUG at include/linux/skbuff.h:2655 !
invalid opcode: 0000 [#1] SMP KASAN
CPU: 0 PID: 1441 Comm: syz-executor199 Not tainted 6.1.24-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
RIP: 0010:__skb_pull include/linux/skbuff.h:2655 [inline]
RIP: 0010:skb_mac_gso_segment+0x68f/0x6a0 net/core/gro.c:136
Code: fd 48 8b 5c 24 10 44 89 6b 70 48 c7 c7 c0 ae 0d 86 44 89 e6 e8 a1 91 d0 00 48 c7 c7 00 af 0d 86 48 89 de 31 d2 e8 d1 4a e9 ff <0f> 0b 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 41
RSP: 0018:ffffc90001bd7520 EFLAGS: 00010286
RAX: ffffffff8469736a RBX: ffff88810f31dac0 RCX: ffff888115a18b00
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffffc90001bd75e8 R08: ffffffff84697183 R09: fffff5200037adf9
R10: 0000000000000000 R11: dffffc0000000001 R12: 0000000000000012
R13: 000000000000fee5 R14: 0000000000005865 R15: 000000000000fed7
FS: 000055555633f300(0000) GS:ffff8881f6a00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020000000 CR3: 0000000116fea000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
[<ffffffff847018dd>] __skb_gso_segment+0x32d/0x4c0 net/core/dev.c:3419
[<ffffffff8470398a>] skb_gso_segment include/linux/netdevice.h:4819 [inline]
[<ffffffff8470398a>] validate_xmit_skb+0x3aa/0xee0 net/core/dev.c:3725
[<ffffffff84707042>] __dev_queue_xmit+0x1332/0x3300 net/core/dev.c:4313
[<ffffffff851a9ec7>] dev_queue_xmit+0x17/0x20 include/linux/netdevice.h:3029
[<ffffffff851b4a82>] packet_snd net/packet/af_packet.c:3111 [inline]
[<ffffffff851b4a82>] packet_sendmsg+0x49d2/0x6470 net/packet/af_packet.c:3142
[<ffffffff84669a12>] sock_sendmsg_nosec net/socket.c:716 [inline]
[<ffffffff84669a12>] sock_sendmsg net/socket.c:736 [inline]
[<ffffffff84669a12>] __sys_sendto+0x472/0x5f0 net/socket.c:2139
[<ffffffff84669c75>] __do_sys_sendto net/socket.c:2151 [inline]
[<ffffffff84669c75>] __se_sys_sendto net/socket.c:2147 [inline]
[<ffffffff84669c75>] __x64_sys_sendto+0xe5/0x100 net/socket.c:2147
[<ffffffff8551d40f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline]
[<ffffffff8551d40f>] do_syscall_64+0x2f/0x50 arch/x86/entry/common.c:80
[<ffffffff85600087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd

Fixes: 469aceddfa ("vlan: consolidate VLAN parsing code and limit max parsing depth")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Toke Høiland-Jørgensen <toke@redhat.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 10:25:55 +01:00
Russell King (Oracle)
43fb622d91 net: pcs: xpcs: fix incorrect number of interfaces
In synopsys_xpcs_compat[], the DW_XPCS_2500BASEX entry was setting
the number of interfaces using the xpcs_2500basex_features array
rather than xpcs_2500basex_interfaces. This causes us to overflow
the array of interfaces. Fix this.

Fixes: f27abde304 ("net: pcs: add 2500BASEX support for Intel mGbE controller")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 10:25:02 +01:00
Eric Dumazet
d0ac89f6f9 net: deal with most data-races in sk_wait_event()
__condition is evaluated twice in sk_wait_event() macro.

First invocation is lockless, and reads can race with writes,
as spotted by syzbot.

BUG: KCSAN: data-race in sk_stream_wait_connect / tcp_disconnect

write to 0xffff88812d83d6a0 of 4 bytes by task 9065 on cpu 1:
tcp_disconnect+0x2cd/0xdb0
inet_shutdown+0x19e/0x1f0 net/ipv4/af_inet.c:911
__sys_shutdown_sock net/socket.c:2343 [inline]
__sys_shutdown net/socket.c:2355 [inline]
__do_sys_shutdown net/socket.c:2363 [inline]
__se_sys_shutdown+0xf8/0x140 net/socket.c:2361
__x64_sys_shutdown+0x31/0x40 net/socket.c:2361
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88812d83d6a0 of 4 bytes by task 9040 on cpu 0:
sk_stream_wait_connect+0x1de/0x3a0 net/core/stream.c:75
tcp_sendmsg_locked+0x2e4/0x2120 net/ipv4/tcp.c:1266
tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1484
inet6_sendmsg+0x63/0x80 net/ipv6/af_inet6.c:651
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg net/socket.c:747 [inline]
__sys_sendto+0x246/0x300 net/socket.c:2142
__do_sys_sendto net/socket.c:2154 [inline]
__se_sys_sendto net/socket.c:2150 [inline]
__x64_sys_sendto+0x78/0x90 net/socket.c:2150
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x00000000 -> 0x00000068

Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 10:03:32 +01:00
Eric Dumazet
e05a5f510f net: annotate sk->sk_err write from do_recvmmsg()
do_recvmmsg() can write to sk->sk_err from multiple threads.

As said before, many other points reading or writing sk_err
need annotations.

Fixes: 34b88a68f2 ("net: Fix use after free in the recvmmsg exit path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:58:29 +01:00
Eric Dumazet
a939d14919 netlink: annotate accesses to nlk->cb_running
Both netlink_recvmsg() and netlink_native_seq_show() read
nlk->cb_running locklessly. Use READ_ONCE() there.

Add corresponding WRITE_ONCE() to netlink_dump() and
__netlink_dump_start()

syzbot reported:
BUG: KCSAN: data-race in __netlink_dump_start / netlink_recvmsg

write to 0xffff88813ea4db59 of 1 bytes by task 28219 on cpu 0:
__netlink_dump_start+0x3af/0x4d0 net/netlink/af_netlink.c:2399
netlink_dump_start include/linux/netlink.h:308 [inline]
rtnetlink_rcv_msg+0x70f/0x8c0 net/core/rtnetlink.c:6130
netlink_rcv_skb+0x126/0x220 net/netlink/af_netlink.c:2577
rtnetlink_rcv+0x1c/0x20 net/core/rtnetlink.c:6192
netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
netlink_unicast+0x56f/0x640 net/netlink/af_netlink.c:1365
netlink_sendmsg+0x665/0x770 net/netlink/af_netlink.c:1942
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg net/socket.c:747 [inline]
sock_write_iter+0x1aa/0x230 net/socket.c:1138
call_write_iter include/linux/fs.h:1851 [inline]
new_sync_write fs/read_write.c:491 [inline]
vfs_write+0x463/0x760 fs/read_write.c:584
ksys_write+0xeb/0x1a0 fs/read_write.c:637
__do_sys_write fs/read_write.c:649 [inline]
__se_sys_write fs/read_write.c:646 [inline]
__x64_sys_write+0x42/0x50 fs/read_write.c:646
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

read to 0xffff88813ea4db59 of 1 bytes by task 28222 on cpu 1:
netlink_recvmsg+0x3b4/0x730 net/netlink/af_netlink.c:2022
sock_recvmsg_nosec+0x4c/0x80 net/socket.c:1017
____sys_recvmsg+0x2db/0x310 net/socket.c:2718
___sys_recvmsg net/socket.c:2762 [inline]
do_recvmmsg+0x2e5/0x710 net/socket.c:2856
__sys_recvmmsg net/socket.c:2935 [inline]
__do_sys_recvmmsg net/socket.c:2958 [inline]
__se_sys_recvmmsg net/socket.c:2951 [inline]
__x64_sys_recvmmsg+0xe2/0x160 net/socket.c:2951
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

value changed: 0x00 -> 0x01

Fixes: 16b304f340 ("netlink: Eliminate kmalloc in netlink dump operation.")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:28:38 +01:00
David S. Miller
a5b3363d8c Merge branch 'bonding-overflow'
Hangbin Liu says:

====================
bonding: fix send_peer_notif overflow

Bonding send_peer_notif was defined as u8. But the value is
num_peer_notif multiplied by peer_notif_delay, which is u8 * u32.
This would cause the send_peer_notif overflow.

Before the fix:
TEST: num_grat_arp (active-backup miimon num_grat_arp 10)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 20)           [ OK ]
4 garp packets sent on active slave eth1
TEST: num_grat_arp (active-backup miimon num_grat_arp 30)           [FAIL]
24 garp packets sent on active slave eth1
TEST: num_grat_arp (active-backup miimon num_grat_arp 50)           [FAIL]

After the fix:
TEST: num_grat_arp (active-backup miimon num_grat_arp 10)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 20)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 30)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 50)           [ OK ]
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:27:21 +01:00
Hangbin Liu
6cbe791c0f kselftest: bonding: add num_grat_arp test
TEST: num_grat_arp (active-backup miimon num_grat_arp 10)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 20)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 30)           [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 50)           [ OK ]

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:27:20 +01:00
Hangbin Liu
b6d1599f8c selftests: forwarding: lib: add netns support for tc rule handle stats get
When run the test in netns, it's not easy to get the tc stats via
tc_rule_handle_stats_get(). With the new netns parameter, we can get
stats from specific netns like

  num=$(tc_rule_handle_stats_get "dev eth0 ingress" 101 ".packets" "-n ns")

Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:27:20 +01:00
Hangbin Liu
84df83e0ec Documentation: bonding: fix the doc of peer_notif_delay
Bonding only supports setting peer_notif_delay with miimon set.

Fixes: 0307d589c4 ("bonding: add documentation for peer_notif_delay")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:27:20 +01:00
Hangbin Liu
9949e2efb5 bonding: fix send_peer_notif overflow
Bonding send_peer_notif was defined as u8. Since commit 07a4ddec3c
("bonding: add an option to specify a delay between peer notifications").
the bond->send_peer_notif will be num_peer_notif multiplied by
peer_notif_delay, which is u8 * u32. This would cause the send_peer_notif
overflow easily. e.g.

  ip link add bond0 type bond mode 1 miimon 100 num_grat_arp 30 peer_notify_delay 1000

To fix the overflow, let's set the send_peer_notif to u32 and limit
peer_notif_delay to 300s.

Reported-by: Liang Li <liali@redhat.com>
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2090053
Fixes: 07a4ddec3c ("bonding: add an option to specify a delay between peer notifications")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:27:20 +01:00
Daniel Golle
7c83e28f10 net: ethernet: mtk_eth_soc: fix NULL pointer dereference
Check for NULL pointer to avoid kernel crashing in case of missing WO
firmware in case only a single WEDv2 device has been initialized, e.g. on
MT7981 which can connect just one wireless frontend.

Fixes: 86ce0d09e4 ("net: ethernet: mtk_eth_soc: use WO firmware for MT7981")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2023-05-10 09:23:53 +01:00
Florian Westphal
3acf8f6c14 selftests: nft_flowtable.sh: check ingress/egress chain too
Make sure flowtable interacts correctly with ingress and egress
chains, i.e. those get handled before and after flow table respectively.

Adds three more tests:
1. repeat flowtable test, but with 'ip dscp set cs3' done in
   inet forward chain.

Expect that some packets have been mangled (before flowtable offload
became effective) while some pass without mangling (after offload
succeeds).

2. repeat flowtable test, but with 'ip dscp set cs3' done in
   veth0:ingress.

Expect that all packets pass with cs3 dscp field.

3. same as 2, but use veth1:egress.  Expect the same outcome.

Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:31:07 +02:00
Boris Sukholitko
90ab51226d selftests: nft_flowtable.sh: monitor result file sizes
When running nft_flowtable.sh in VM on a busy server we've found that
the time of the netcat file transfers vary wildly.

Therefore replace hardcoded 3 second sleep with the loop checking for
a change in the file sizes. Once no change in detected we test the results.

Nice side effect is that we shave 1 second sleep in the fast case
(hard-coded 3 second sleep vs two 1 second sleeps).

Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:31:07 +02:00
Boris Sukholitko
1114803c2d selftests: nft_flowtable.sh: wait for specific nc pids
Doing wait with no parameters may interfere with some of the tests
having their own background processes.

Although no such test is currently present, the cleanup is useful
to rely on the nft_flowtable.sh for local development (e.g. running
background tcpdump command during the tests).

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:31:06 +02:00
Boris Sukholitko
0749d670d7 selftests: nft_flowtable.sh: no need for ps -x option
Some ps commands (e.g. busybox derived) have no -x option. For the
purposes of hash calculation of the list of processes this option is
inessential.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:31:05 +02:00
Boris Sukholitko
0a11073e8e selftests: nft_flowtable.sh: use /proc for pid checking
Some ps commands (e.g. busybox derived) have no -p option. Use /proc for
pid existence check.

Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 09:30:52 +02:00
Florian Westphal
e72eeab542 netfilter: conntrack: fix possible bug_on with enable_hooks=1
I received a bug report (no reproducer so far) where we trip over

712         rcu_read_lock();
713         ct_hook = rcu_dereference(nf_ct_hook);
714         BUG_ON(ct_hook == NULL);  // here

In nf_conntrack_destroy().

First turn this BUG_ON into a WARN.  I think it was triggered
via enable_hooks=1 flag.

When this flag is turned on, the conntrack hooks are registered
before nf_ct_hook pointer gets assigned.
This opens a short window where packets enter the conntrack machinery,
can have skb->_nfct set up and a subsequent kfree_skb might occur
before nf_ct_hook is set.

Call nf_conntrack_init_end() to set nf_ct_hook before we register the
pernet ops.

Fixes: ba3fbe6636 ("netfilter: nf_conntrack: provide modparam to always register conntrack hooks")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 08:50:39 +02:00
Florian Westphal
dc1c9fd4a8 netfilter: nf_tables: always release netdev hooks from notifier
This reverts "netfilter: nf_tables: skip netdev events generated on netns removal".

The problem is that when a veth device is released, the veth release
callback will also queue the peer netns device for removal.

Its possible that the peer netns is also slated for removal.  In this
case, the device memory is already released before the pre_exit hook of
the peer netns runs:

BUG: KASAN: slab-use-after-free in nf_hook_entry_head+0x1b8/0x1d0
Read of size 8 at addr ffff88812c0124f0 by task kworker/u8:1/45
Workqueue: netns cleanup_net
Call Trace:
 nf_hook_entry_head+0x1b8/0x1d0
 __nf_unregister_net_hook+0x76/0x510
 nft_netdev_unregister_hooks+0xa0/0x220
 __nft_release_hook+0x184/0x490
 nf_tables_pre_exit_net+0x12f/0x1b0
 ..

Order is:
1. First netns is released, veth_dellink() queues peer netns device
   for removal
2. peer netns is queued for removal
3. peer netns device is released, unreg event is triggered
4. unreg event is ignored because netns is going down
5. pre_exit hook calls nft_netdev_unregister_hooks but device memory
   might be free'd already.

Fixes: 68a3765c65 ("netfilter: nf_tables: skip netdev events generated on netns removal")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-05-10 08:50:18 +02:00
Florian Fainelli
582dbb2cc1 net: phy: bcm7xx: Correct read from expansion register
Since the driver works in the "legacy" addressing mode, we need to write
to the expansion register (0x17) with bits 11:8 set to 0xf to properly
select the expansion register passed as argument.

Fixes: f68d08c437 ("net: phy: bcm7xxx: Add EPHY entry for 72165")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230508231749.1681169-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09 20:25:52 -07:00
Kuniyuki Iwashima
dfd9248c07 net: Fix load-tearing on sk->sk_stamp in sock_recv_cmsgs().
KCSAN found a data race in sock_recv_cmsgs() where the read access
to sk->sk_stamp needs READ_ONCE().

BUG: KCSAN: data-race in packet_recvmsg / packet_recvmsg

write (marked) to 0xffff88803c81f258 of 8 bytes by task 19171 on cpu 0:
 sock_write_timestamp include/net/sock.h:2670 [inline]
 sock_recv_cmsgs include/net/sock.h:2722 [inline]
 packet_recvmsg+0xb97/0xd00 net/packet/af_packet.c:3489
 sock_recvmsg_nosec net/socket.c:1019 [inline]
 sock_recvmsg+0x11a/0x130 net/socket.c:1040
 sock_read_iter+0x176/0x220 net/socket.c:1118
 call_read_iter include/linux/fs.h:1845 [inline]
 new_sync_read fs/read_write.c:389 [inline]
 vfs_read+0x5e0/0x630 fs/read_write.c:470
 ksys_read+0x163/0x1a0 fs/read_write.c:613
 __do_sys_read fs/read_write.c:623 [inline]
 __se_sys_read fs/read_write.c:621 [inline]
 __x64_sys_read+0x41/0x50 fs/read_write.c:621
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

read to 0xffff88803c81f258 of 8 bytes by task 19183 on cpu 1:
 sock_recv_cmsgs include/net/sock.h:2721 [inline]
 packet_recvmsg+0xb64/0xd00 net/packet/af_packet.c:3489
 sock_recvmsg_nosec net/socket.c:1019 [inline]
 sock_recvmsg+0x11a/0x130 net/socket.c:1040
 sock_read_iter+0x176/0x220 net/socket.c:1118
 call_read_iter include/linux/fs.h:1845 [inline]
 new_sync_read fs/read_write.c:389 [inline]
 vfs_read+0x5e0/0x630 fs/read_write.c:470
 ksys_read+0x163/0x1a0 fs/read_write.c:613
 __do_sys_read fs/read_write.c:623 [inline]
 __se_sys_read fs/read_write.c:621 [inline]
 __x64_sys_read+0x41/0x50 fs/read_write.c:621
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x72/0xdc

value changed: 0xffffffffc4653600 -> 0x0000000000000000

Reported by Kernel Concurrency Sanitizer on:
CPU: 1 PID: 19183 Comm: syz-executor.5 Not tainted 6.3.0-rc7-02330-gca6270c12e20 #2
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014

Fixes: 6c7c98bad4 ("sock: avoid dirtying sk_stamp, if possible")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20230508175543.55756-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09 20:09:49 -07:00
Marek Vasut
8efbdbfa99 net: stmmac: Initialize MAC_ONEUS_TIC_COUNTER register
Initialize MAC_ONEUS_TIC_COUNTER register with correct value derived
from CSR clock, otherwise EEE is unstable on at least NXP i.MX8M Plus
and Micrel KSZ9131RNX PHY, to the point where not even ARP request can
be sent out.

i.MX 8M Plus Applications Processor Reference Manual, Rev. 1, 06/2021
11.7.6.1.34 One-microsecond Reference Timer (MAC_ONEUS_TIC_COUNTER)
defines this register as:
"
This register controls the generation of the Reference time (1 microsecond
tic) for all the LPI timers. This timer has to be programmed by the software
initially.
...
The application must program this counter so that the number of clock cycles
of CSR clock is 1us. (Subtract 1 from the value before programming).
For example if the CSR clock is 100MHz then this field needs to be programmed
to value 100 - 1 = 99 (which is 0x63).
This is required to generate the 1US events that are used to update some of
the EEE related counters.
"

The reset value is 0x63 on i.MX8M Plus, which means expected CSR clock are
100 MHz. However, the i.MX8M Plus "enet_qos_root_clk" are 266 MHz instead,
which means the LPI timers reach their count much sooner on this platform.

This is visible using a scope by monitoring e.g. exit from LPI mode on TX_CTL
line from MAC to PHY. This should take 30us per STMMAC_DEFAULT_TWT_LS setting,
during which the TX_CTL line transitions from tristate to low, and 30 us later
from low to high. On i.MX8M Plus, this transition takes 11 us, which matches
the 30us * 100/266 formula for misconfigured MAC_ONEUS_TIC_COUNTER register.

Configure MAC_ONEUS_TIC_COUNTER based on CSR clock, so that the LPI timers
have correct 1us reference. This then fixes EEE on i.MX8M Plus with Micrel
KSZ9131RNX PHY.

Fixes: 477286b53f ("stmmac: add GMAC4 core support")
Signed-off-by: Marek Vasut <marex@denx.de>
Tested-by: Harald Seiler <hws@denx.de>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Verdin iMX8MP
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20230506235845.246105-1-marex@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2023-05-09 19:42:59 -07:00
Linus Torvalds
16a8829130 nfs: fix another case of NULL/IS_ERR confusion wrt folio pointers
Dan has been improving on the smatch error pointer checks, and pointed
at another case where the __filemap_get_folio() conversion to error
pointers had been overlooked.  This time because it was hidden behind
the filemap_grab_folio() helper function that is a wrapper around it.

Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2023-05-09 10:22:13 -07:00
Linus Torvalds
1dc3731daf for-6.4-rc1-tag
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmRaYhYACgkQxWXV+ddt
 WDvCRQ/+MjRuInALh+N34mMneF8jjPOlUQBZbaC43XYJ0ss9drSzvE3STmPVrjdK
 IHyzRYipKI6vdTtzYbyGwxJ9oazsuXTQXC3w/qMW1hO1EAQ0a9tbnTSIQ+BDbU63
 BW7rJ3JuM6hKxKK+e9Dserhks0lOgQc+xKT1CUELvAHp3UykD4OrNczguaIT2lGR
 YXL+9B3ex2SooCqrQStkqEtjD/kxbaYUkK7yWA2FssXWqU5SjZwUOsuY3ZPOWrm1
 ULNI67gIxkMkSynV3aYka7nY3xc9oGIfk9WPeylWcOcH3+pWabeptjk617XbA0KI
 4biz1zZ/qTRXWlCLDv3ukUa5EIVAWQ1kxVE/hAt3SzqJvoqB/ymML/2LeQNdyx2i
 adMTZQ95JkhQNU9Lp9QOtpgfZonhhjxnL9KE7eMVo28zJFdYjge3egINjimY+mLz
 qzrzUBI3bqCNYG0LRR1EvuN0feBd/9nNMFjLBi2mkDqsWtzvTxxzWvVlV5EEcoJe
 xrozGh00Y5ioP6ZanKuZRib+u2ligbD66dYhKSU74D6B5kuZPic3Kkn9qICjRByM
 uBGBze/7GT/3ouhPOwxVPtGZstiFhbAxE7mApROrIxAx8I9rZjBdHgFJQklolXNy
 HSKNf3u98XZBVVcku/O1hyoeTLnApPfApxD4lv3qlRmgdEnAp6I=
 =K25T
 -----END PGP SIGNATURE-----

Merge tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux

Pull btrfs fixes from David Sterba:

 - fix backward leaf iteration which could possibly return the same key

 - fix assertion when device add and balance race for exclusive
   operation

 - fix regression when freeing device, state tree would leak after
   device replace

 - fix attempt to clear space cache v1 when block-group-tree is enabled

 - fix potential i_size corruption when encoded write races with send v2
   and enabled no-holes (the race is hard to hit though, the window is a
   few instructions wide)

 - fix wrong bitmap API use when checking empty zones, parameters were
   swapped but not causing a bug due to other code

 - prevent potential qgroup leak if subvolume create does not commit
   transaction (which is pending in the development queue)

 - error handling and reporting:
     - abort transaction when sibling keys check fails for leaves
     - print extent buffers when sibling keys check fails

* tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
  btrfs: don't free qgroup space unless specified
  btrfs: fix encoded write i_size corruption with no-holes
  btrfs: zoned: fix wrong use of bitops API in btrfs_ensure_empty_zones
  btrfs: properly reject clear_cache and v1 cache for block-group-tree
  btrfs: print extent buffers when sibling keys check fails
  btrfs: abort transaction when sibling keys check fails for leaves
  btrfs: fix leak of source device allocation state after device replace
  btrfs: fix assertion of exclop condition when starting balance
  btrfs: fix btrfs_prev_leaf() to not return the same key twice
2023-05-09 09:53:41 -07:00
Liming Sun
3d43f9f639 platform/mellanox: fix potential race in mlxbf-tmfifo driver
This commit adds memory barrier for the 'vq' update in function
mlxbf_tmfifo_virtio_find_vqs() to avoid potential race due to
out-of-order memory write. It also adds barrier for the 'is_ready'
flag to make sure the initializations are visible before this flag
is checked.

Signed-off-by: Liming Sun <limings@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/b98c0ab61d644ba38fa9b3fd1607b138b0dd820b.1682518748.git.limings@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 11:54:35 +02:00
Roy Novich
162bd18eb5 linux/dim: Do nothing if no time delta between samples
Add return value for dim_calc_stats. This is an indication for the
caller if curr_stats was assigned by the function. Avoid using
curr_stats uninitialized over {rdma/net}_dim, when no time delta between
samples. Coverity reported this potential use of an uninitialized
variable.

Fixes: 4c4dbb4a73 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Link: https://lore.kernel.org/r/20230507135743.138993-1-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2023-05-09 11:06:45 +02:00
Andrey Avdeev
4b65f95c87 platform/x86: touchscreen_dmi: Add info for the Dexp Ursus KX210i
Add touchscreen info for the Dexp Ursus KX210i

Signed-off-by: Andrey Avdeev <jamesstoun@gmail.com>
Link: https://lore.kernel.org/r/ZE4gRgzRQCjXFYD0@avdeevavpc
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 10:54:58 +02:00
Hans de Goede
6abfa99ce5 platform/x86: touchscreen_dmi: Add upside-down quirk for GDIX1002 ts on the Juno Tablet
The Juno Computers Juno Tablet has an upside-down mounted Goodix
touchscreen. Add a quirk to invert both axis to correct for this.

Link: https://junocomputers.com/us/product/juno-tablet/
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Link: https://lore.kernel.org/r/20230505210323.43177-1-hdegoede@redhat.com
2023-05-09 10:54:50 +02:00
Mark Pearson
1684878952 platform/x86: thinkpad_acpi: Add profile force ability
There has been a lot of confusion around which platform profiles are
supported on various platforms and it would be useful to have a debug
method to be able to override the profile mode that is selected.

I don't expect this to be used in anything other than debugging in
conjunction with Lenovo engineers - but it does give a way to get a
system working whilst we wait for either FW fixes, or a driver fix
to land upstream, if something is wonky in the mode detection logic

Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20230505132523.214338-2-mpearson-lenovo@squebb.ca
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 10:54:45 +02:00
Mark Pearson
0c0cd3e25a platform/x86: thinkpad_acpi: Fix platform profiles on T490
I had incorrectly thought that PSC profiles were not usable on Intel
platforms so had blocked them in the driver initialistion. This broke
platform profiles on the T490.

After discussion with the FW team PSC does work on Intel platforms and
should be allowed.

Note - it's possible this may impact other platforms where it is advertised
but special driver support that only Windows has is needed. But if it does
then they will need fixing via quirks. Please report any issues to me so I
can get them addressed - but I haven't found any problems in testing...yet

Fixes: bce6243f76 ("platform/x86: thinkpad_acpi: do not use PSC mode on Intel platforms")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2177962
Cc: stable@vger.kernel.org
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20230505132523.214338-1-mpearson-lenovo@squebb.ca
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 10:54:39 +02:00
Fae
decab2825c platform/x86: hp-wmi: add micmute to hp_wmi_keymap struct
Fixes micmute key of HP Envy X360 ey0xxx.

Signed-off-by: Fae <faenkhauser@gmail.com>
Link: https://lore.kernel.org/r/20230425063644.11828-1-faenkhauser@gmail.com
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 10:54:33 +02:00
Srinivas Pandruvada
75e406b540 platform/x86/intel-uncore-freq: Return error on write frequency
Currently when the uncore_write() returns error, it is silently
ignored. Return error to user space when uncore_write() fails.

Fixes: 49a474c7ba ("platform/x86: Add support for Uncore frequency control")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230418153230.679094-1-srinivas.pandruvada@linux.intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
2023-05-09 10:53:58 +02:00