1312734 Commits

Author SHA1 Message Date
SkyLake.Huang
3cb1a3c9cb net: phy: mediatek: Integrate read/write page helper functions
This patch integrates the read/write page helper functions into the MTK phy lib.
They are basically the same in mtk-ge.c & mtk-ge-soc.c.

Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 13:06:04 +00:00
SkyLake.Huang
477c200aa7 net: phy: mediatek: Improve readability of mtk-phy-lib.c's mtk_phy_led_hw_ctrl_set()
This patch removes parens around TRIGGER_NETDEV_RX/TRIGGER_NETDEV_TX in
mtk_phy_led_hw_ctrl_set(), which improves readability.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 13:06:04 +00:00
SkyLake.Huang
7f9c320c98 net: phy: mediatek: Move LED helper functions into mtk phy lib
This patch creates mtk-phy-lib.c & mtk-phy.h and integrates mtk-ge-soc.c's
LED helper functions so that we can use those helper functions in other
MTK ethernet phy drivers.

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 13:06:04 +00:00
SkyLake.Huang
4c452f7ea8 net: phy: mediatek: Re-organize MediaTek ethernet phy drivers
Re-organize MediaTek ethernet phy driver files and get ready to integrate
some common functions and add a new 2.5G phy driver.
mtk-ge.c: MT7530 Gphy on MT7621 & MT7531 Gphy
mtk-ge-soc.c: Built-in Gphy on MT7981 & Built-in switch Gphy on MT7988
mtk-2p5ge.c: Planned for built-in 2.5G phy on MT7988

Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 13:06:04 +00:00
David S. Miller
8545b75bc4 Merge branch 'octeontx2-rvu-rep'
Geetha sowjanya says:

====================
Introduce RVU representors

This series adds representor support for each rvu device.
When switchdev mode is enabled, representor netdev is registered
for each rvu device. In implementation of representor model,
one NIX HW LF with multiple SQ and RQ is reserved, where each
RQ and SQ of the LF are mapped to a representor. A loopback channel
is reserved to support packet path between representors and VFs.
CN10K silicon supports 2 types of MACs, RPM and SDP. This
patch set adds representor support for both RPM and SDP MAC
interfaces.

- Patch 1: Implements basic representor driver.
- Patch 2: Add devlink support to create representor netdevs that
  can be used to manage VFs.
- Patch 3: Implements basic netdev_ndo_ops.
- Patch 4: Installs tcam rules to route packets between representor and
	   VFs.
- Patch 5: Enables fetching VF stats via representor interface
- Patch 6: Adds support to sync link state between representors and VFs.
- Patch 7: Enables configuring VF MTU via representor netdevs.
- Patch 8: Adds representors for sdp MAC.
- Patch 9: Adds devlink port support.
- Patch 10: Implements offload stats.
- Patch 11: Implements tc offload support.
- Patch 12: Adds documentation for rvu port representor.

pci/0002:1c:00.0

Command to create PF/VF representor

	Rpf1vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether f6:43:83:ee:26:21 brd ff:ff:ff:ff:ff:ff
	Rpf1vf1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 12:b2:54:0e:24:54 brd ff:ff:ff:ff:ff:ff
	Rpf1vf2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 4a:12:c4:4c:32:62 brd ff:ff:ff:ff:ff:ff
	Rpf1vf3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether ca:cb:68:0e:e2:6e brd ff:ff:ff:ff:ff:ff
	Rpf2vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 06:cc:ad:b4:f0:93 brd ff:ff:ff:ff:ff:ff

~# devlink port
	pci/0002:1c:00.0/0: type eth netdev Rpf1vf0 flavour physical port 0 splittable false
	pci/0002:1c:00.0/1: type eth netdev Rpf1vf1 flavour pcivf controller 0 pfnum 1 vfnum 1 external false splittable false
	pci/0002:1c:00.0/2: type eth netdev Rpf1vf2 flavour pcivf controller 0 pfnum 1 vfnum 2 external false splittable false
	pci/0002:1c:00.0/3: type eth netdev Rpf1vf3 flavour pcivf controller 0 pfnum 1 vfnum 3 external false splittable false

-----------
v11:v1:
  - Submitted refactoring changes as a separate patch set.
	https://lore.kernel.org/netdev/20241023161843.15543-1-gakula@marvell.com/T/
  - Moved documentation to a separate patch.
  - patch 9: Added code changes to forward updated mac address to VF.
  - Implemented TC offload support.

v10-v11:
  - As suggested by "Jiri Pirko" adjusted the documentation.
  - Added more commit description to patch1.

v9-v10:
  - Fixed build warning w.r.t documentation.

v8-v9:
   - Updated the documentation.

v7-v8:
   - Implemented offload stats ndo.
   - Added documentation.

v6-v7:
  - Rebased on top net-next branch.

v5-v6:
  - Addressed review comments provided by "Simon Horman".
  - Added review tag.

v4-v5:
  - Patch 3: Removed devm_* usage in rvu_rep_create()
  - Patch 3: Fixed build warnings.

v3-v4:
 - Patch 2 & 3: Fixed coccinelle reported warnings.
 - Patch 10: Added devlink port support.

v2-v3:
 - Used extack for error messages.
 - As suggested reworked commit messages.
 - Fixed sparse warning.

v1-v2:
 - Fixed build warnings.
 - Addressed review comments provided by "Kalesh Anakkur Purayil".
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:12 +00:00
Geetha sowjanya
6050b04dca Documentation: octeontx2: Add Documentation for RVU representors
Adds documentation for creating and configuring rvu port representors

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:12 +00:00
Geetha sowjanya
6c40ca957f octeontx2-pf: Adds TC offload support
Implements tc offload support for rvu representors.

Usage example:

 - Add tc rule to drop packets with vlan id 3 using port
   representor(Rpf1vf0).

	# tc filter add dev Rpf1vf0 protocol 802.1Q parent ffff: flower
	   vlan_id 3 vlan_ethtype ipv4 skip_sw action drop

- Redirect packets with vlan id 5 and IPv4 packets to eth1,
  after stripping vlan header.

	# tc filter add dev Rpf1vf0 ingress protocol 802.1Q flower vlan_id 5
	  vlan_ethtype ipv4 skip_sw action vlan pop action mirred ingress
	  redirect dev eth1

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:12 +00:00
Geetha sowjanya
d8dec30b51 octeontx2-pf: Implement offload stats ndo for representors
Implement the offload stat ndo by fetching the HW stats
of rx/tx queues attached to the representor.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
9ed0343f56 octeontx2-pf: Add devlink port support
Register devlink port for the rvu representors.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
2f7f33a095 octeontx2-pf: Add representors for sdp MAC
Hardware supports different types of MACs, e.g. RPM, SDP and LBK.
LBK is for the internal Tx->Rx HW loopback path. RPM and SDP MACs support
ingress/egress pkt IO on interfaces with different sets of capabilities
like interface modes. At the time of netdev driver registration the PF
seeks MAC related information from the Admin function driver
'drivers/net/ethernet/marvell/octeontx2/af' and sets up ingress/egress
queues etc such that pkt IO on the channels of these different MACs is
possible. This patch adds representors for SDP MAC.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
3392f91903 octeontx2-pf: Configure VF mtu via representor
Adds support to manage the mtu configuration for VF through representor.
On update of the representor mtu, an mbox notification is sent
to the VF to update its mtu.

This feature is implemented based on the "Network Function Representors"
kernel documentation.
"
Setting an MTU on the representor should cause that same MTU
to be reported to the representee.
"

Signed-off-by: Sai Krishna <saikrishnag@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
b8fea84a04 octeontx2-pf: Add support to sync link state between representor and VFs
Implements the below requirement mentioned
in the representors documentation.

"
The representee's link state is controlled through the
representor. Setting the representor administratively UP
or DOWN should cause carrier ON or OFF at the representee.
"

This patch enables
- Reflecting the link state of the representor based on the VF state, and
  the link state of the VF based on the representor.
- On VF interface up/down, a notification is sent via mbox to the representor
  to update the link state.
  eg: ip link set eth0 up/down  will set carrier on/off
       of the corresponding representor(r0p1) interface.
- Representor interface up/down causes a link state update of the VF.
  eg: ip link set r0p1 up/down  will set carrier on/off
       of the corresponding representee(eth0) interface.

Signed-off-by: Harman Kalra <hkalra@marvell.com>
Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
940754a21d octeontx2-pf: Get VF stats via representor
Adds support to export VF port statistics via representor
netdev. Defines new mbox "NIX_LF_STATS" to fetch VF hw stats.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
683645a231 octeontx2-af: Add packet path between representor and VF
Current HW does not support a built-in switch that forwards pkts
between representee and representor. When a representor is put under
a bridge and pkts need to be sent to the representee, pkts from the
representor are sent on a HW internal loopback channel, which again
punts them to the ingress pkt parser. The rules that this patch
installs are MCAM filters/rules which match against these
pkts and forward them to the representee.
The rules are for the basic
representor <=> representee path, similar to Tun/TAP between VM and
Host.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
22f8587967 octeontx2-pf: Add basic net_device_ops
Implements basic set of net_device_ops.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
3937b7308d octeontx2-pf: Create representor netdev
Adds initial devlink support to set/get the switchdev mode.
Representor netdevs are created for each rvu device when
the switch mode is set to 'switchdev'. These netdevs are
used to control and configure VFs.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Geetha sowjanya
222a4eea9c octeontx2-pf: RVU representor driver
Adds basic driver for the RVU representor.

Driver on probe does pci specific initialization and
does hw resources configuration. Introduces RVU_ESWITCH
kernel config to enable/disable the driver. Representor
and NIC share the code, but representor netdevs support
a subset of NIC functionality. Hence the "otx2_rep_dev" API
helps to skip the initialization of features that are not
supported by the representors.

Signed-off-by: Geetha sowjanya <gakula@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-13 11:57:11 +00:00
Jakub Kicinski
ef04d290c0 net: page_pool: do not count normal frag allocation in stats
Commit 0f6deac3a079 ("net: page_pool: add page allocation stats for
two fast page allocate path") added increments for "fast path"
allocation to page frag alloc. It mentions performance degradation
analysis but the details are unclear. Could be that the author
was simply surprised by the alloc stats not matching packet count.

In my experience the key metric for page pool is the recycling rate.
Page return stats, however, count returned _pages_ not frags.
This makes it impossible to calculate recycling rate for drivers
using the frag API. Here is example output of the page-pool
YNL sample for a driver allocating 1200B frags (4k pages)
with nearly perfect recycling:

  $ ./page-pool
    eth0[2]	page pools: 32 (zombies: 0)
		refs: 291648 bytes: 1194590208 (refs: 0 bytes: 0)
		recycling: 33.3% (alloc: 4557:2256365862 recycle: 200476245:551541893)

The recycling rate is reported as 33.3% because we give out
4096 // 1200 = 3 frags for every recycled page.
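
The arithmetic behind the 33.3% figure can be reproduced with a short
sketch. It assumes the reported rate is the sum of the two recycle counters
divided by the sum of the two alloc counters; the function names are
illustrative and are not part of the page-pool YNL sample.

```python
def frags_per_page(page_size, frag_size):
    """Whole fragments that fit in one page."""
    return page_size // frag_size


def recycling_rate(alloc_counters, recycle_counters):
    """Assumed definition: recycled pages / counted allocations, in percent."""
    return 100.0 * sum(recycle_counters) / sum(alloc_counters)


# Counters copied from the sample output quoted above.
alloc = (4557, 2256365862)
recycle = (200476245, 551541893)

frags = frags_per_page(4096, 1200)
rate = recycling_rate(alloc, recycle)
print(f"{frags} frags per page, reported recycling {rate:.1f}%")
```

When every frag allocation is counted, each recycled page is matched by
about three allocations, so even near-perfect recycling shows up as
roughly one third.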

Effectively revert the aforementioned commit. This also aligns
with the stats we would see for drivers which do the fragmentation
themselves, although that's not a strong reason in itself.

On the (very unlikely) path where we can reuse the current page
let's bump the "cached" stat. The fact that we don't put the page
in the cache is just an optimization.

Acked-by: Jesper Dangaard Brouer <hawk@kernel.org>
Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org>
Link: https://patch.msgid.link/20241109023303.3366500-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12 18:26:58 -08:00
Jakub Kicinski
7ed816be35 eth: bnxt: use page pool for head frags
Testing small size RPCs (300B-400B) on a large AMD system suggests
that page pool recycling is very useful even for just the head frags.
With this patch (and copy break disabled) I see a 30% performance
improvement (82Gbps -> 106Gbps).

Convert bnxt from normal page frags to page pool frags for head buffers.

On systems with small page size we can use the same pool as for TPA
pages. On systems with large pages the frag allocation logic of the
page pool is already used to split a large page into TPA chunks.
TPA chunks are much larger than heads (8k or 64k, AFAICT vs 1kB)
and we always allocate the same sized chunks. Mixing allocation
of TPA and head pages would lead to sub-optimal memory use.
Plus Taehee's work on zero-copy / devmem will need to differentiate
between TPA and non-TPA page pool, anyway. Conditionally allocate
a new page pool for heads.

Link: https://patch.msgid.link/20241109035119.3391864-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12 18:26:38 -08:00
Alexandre Ferrieux
73af53d820 net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes.
To generate hnode handles (in gen_new_htid()), u32 uses IDR and
encodes the returned small integer into a structured 32-bit
word. Unfortunately, at disposal time, the needed decoding
is not done. As a result, idr_remove() fails, and the IDR
fills up. Since its size is 2048, the following script ends up
with "Filter already exists":

  tc filter add dev myve $FILTER1
  tc filter add dev myve $FILTER2
  for i in {1..2048}
  do
    echo $i
    tc filter del dev myve $FILTER2
    tc filter add dev myve $FILTER2
  done

This patch adds the missing decoding logic for handles that
deserve it.
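
The failure mode can be modelled outside the kernel with a toy allocator.
This is a minimal sketch, not the cls_u32 code: the TinyIdr class, the
helper names, the bit layout (marker bit 0x800, shift by 20) and the tiny
capacity are all assumptions chosen for illustration.

```python
HTID_FLAG = 0x800    # assumed marker bit, illustration only
HTID_SHIFT = 20      # assumed shift, illustration only
ID_MASK = 0x7FF


def id_to_handle(small_id):
    """Encode a small allocator id into a structured 32-bit handle."""
    return ((small_id | HTID_FLAG) << HTID_SHIFT) & 0xFFFFFFFF


def handle_to_id(handle):
    """Decode a structured handle back to the small allocator id."""
    if handle & 0x80000000:
        return (handle >> HTID_SHIFT) & ID_MASK
    return handle


class TinyIdr:
    """Toy stand-in for an ID allocator with a fixed id range."""

    def __init__(self, lo, hi):
        self.lo, self.hi = lo, hi
        self.used = set()

    def alloc(self):
        for i in range(self.lo, self.hi + 1):
            if i not in self.used:
                self.used.add(i)
                return i
        raise RuntimeError("id space exhausted")

    def remove(self, i):
        # Removing an id that was never allocated is a silent no-op.
        self.used.discard(i)


def churn(decode, rounds, capacity):
    """Add and delete one entry repeatedly; return the completed rounds."""
    idr = TinyIdr(1, capacity)
    for n in range(rounds):
        try:
            handle = id_to_handle(idr.alloc())
        except RuntimeError:
            return n
        idr.remove(handle_to_id(handle) if decode else handle)
    return rounds


print(churn(decode=False, rounds=100, capacity=8))
print(churn(decode=True, rounds=100, capacity=8))
```

Without decoding, every delete misses and the id space is exhausted after
`capacity` rounds; with decoding, the same id is reused indefinitely.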

Fixes: e7614370d6f0 ("net_sched: use idr to allocate u32 filter handles")
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Signed-off-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Link: https://patch.msgid.link/20241110172836.331319-1-alexandre.ferrieux@orange.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12 18:26:03 -08:00
Andrew Lunn
078e0d596f dsa: qca8k: Use nested lock to avoid splat
qca8k_phy_eth_command() is used to probe the child MDIO bus while the
parent MDIO is locked. This causes a lockdep splat, reporting a possible
deadlock. It is not an actual deadlock, because different locks are
used. By making use of mutex_lock_nested() we can avoid this false
positive.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Link: https://patch.msgid.link/20241110175955.3053664-1-andrew@lunn.ch
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12 18:25:30 -08:00
Geert Uytterhoeven
2b99b25325 MAINTAINERS: Re-add cancelled Renesas driver sections
Removing full driver sections also removed mailing list entries, causing
submitters of future patches to forget CCing these mailing lists.

Hence re-add the sections for the Renesas Ethernet AVB, R-Car SATA, and
SuperH Ethernet drivers.  Add people who volunteered to maintain these
drivers (thanks a lot!), and mark all of them as supported.

Fixes: 6e90b675cf942e50 ("MAINTAINERS: Remove some entries due to various compliance requirements.")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Niklas Cassel <cassel@kernel.org>
Acked-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Link: https://patch.msgid.link/4b2105332edca277f07ffa195796975e9ddce994.1731319098.git.geert+renesas@glider.be
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12 18:23:53 -08:00
Dmitry Kandybka
b169e76eba mptcp: fix possible integer overflow in mptcp_reset_tout_timer
In 'mptcp_reset_tout_timer', promote 'probe_timestamp' to unsigned long
to avoid possible integer overflow. Compile tested only.
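
Why widening one operand matters can be shown with masked integer
arithmetic. This is a generic sketch of a 32-bit subtraction wrapping before
the result is added to a 64-bit value; it is not the expression used in
mptcp_reset_tout_timer, and all names and values are hypothetical.

```python
U32 = (1 << 32) - 1
U64 = (1 << 64) - 1


def deadline_narrow(stamp32, now32, now64, timeout):
    """Subtract in 32 bits first, then widen: wraps when stamp32 < now32."""
    delta = (stamp32 - now32) & U32
    return (delta + now64 + timeout) & U64


def deadline_wide(stamp32, now32, now64, timeout):
    """Widen the timestamp first: the whole sum is taken modulo 2**64."""
    return (stamp32 - now32 + now64 + timeout) & U64


now64 = (1 << 32) + 1500      # hypothetical 64-bit clock value
now32 = now64 & U32           # its low 32 bits
narrow = deadline_narrow(1000, now32, now64, 600)
wide = deadline_wide(1000, now32, now64, 600)
print(narrow - wide)
```

In this model the narrow variant lands exactly 2**32 ticks too late whenever
the timestamp is older than the 32-bit clock value.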

Found by Linux Verification Center (linuxtesting.org) with SVACE.

Signed-off-by: Dmitry Kandybka <d.kandybka@gmail.com>
Link: https://patch.msgid.link/20241107103657.1560536-1-d.kandybka@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12 18:11:23 -08:00
Wander Lairson Costa
50d325bb05 Revert "igb: Disable threaded IRQ for igb_msix_other"
This reverts commit 338c4d3902feb5be49bfda530a72c7ab860e2c9f.

Sebastian noticed the ISR indirectly acquires spin_locks, which are
sleeping locks under PREEMPT_RT, which leads to kernel splats.

Fixes: 338c4d3902feb ("igb: Disable threaded IRQ for igb_msix_other")
Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Wander Lairson Costa <wander@redhat.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Acked-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241106111427.7272-1-wander@redhat.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12 18:05:40 -08:00
Jakub Kicinski
e707e366f3 bluetooth pull request for net:
 - btintel: Direct exception event to bluetooth stack
 - hci_core: Fix calling mgmt_device_connected
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE7E6oRXp8w05ovYr/9JCA4xAyCykFAmczlaIZHGx1aXoudm9u
 LmRlbnR6QGludGVsLmNvbQAKCRD0kIDjEDILKTBYD/9guawZa/20oiagUDOutuT/
 i1KCPlVNrMpbeyDyK2sC2ConWOMHdpBToo0/vEwdKbAB/kCbWDh9DjWvaawSUngX
 XPMTnk279WdWOLh6JUb87af1Q4wt8faro63g5gwTmXQrsIED5MlpMQJ2pZAkEmBe
 pQU3QZJjz2BtnFHnVXHLXe53E3P0kWqlrqcAvdeJWRew+0rm++9f187pn9F/kUd0
 F9f4YZgZAmlk56nT5kdv3NSi/cscm5xajlJSG9PlR40n7Un/T6RZXGzl0KeJ+hJw
 DeyMOYBpBnGDOUe/7coqeZH6AulZWzHHIm5UXmqmVMM7KyT0mL/bxSDyXJnv2e6F
 lXBEFNu6o/15N1S8uU6677+wcnbJ1BXwtDSk8iGOXECBN9hoB52NiIx1HPOI5mEX
 dflH8FLe5hZx4b+yktTVBWWcBOd9cMonOxqOWPgfZ4ZbhnEe1SlVf4Qnh5Amq0yt
 ZixbEP7G4k4uWhHvTdwVWIXPxGeBSmn8sQXG1ZSutwLaU9TQYL5W7m0DSnB4xdQB
 h8J2/tdX63Fjm2tpkabb/oRvns9ekjq98QqNGlA2GP7jaqndJmg5ixg8Jhjn9uPF
 OjG9z6OX4yrFFpJP4SXKAl7W3sg2g0yFGLDjoj2h9zVPuIcvbZG1NTzLgytNNXlG
 JcFpsADEZAZcuUotyfFytg==
 =7vO+
 -----END PGP SIGNATURE-----

Merge tag 'for-net-2024-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth

Luiz Augusto von Dentz says:

====================
bluetooth pull request for net:

 - btintel: Direct exception event to bluetooth stack
 - hci_core: Fix calling mgmt_device_connected

* tag 'for-net-2024-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth:
  Bluetooth: btintel: Direct exception event to bluetooth stack
  Bluetooth: hci_core: Fix calling mgmt_device_connected
====================

Link: https://patch.msgid.link/20241112175326.930800-1-luiz.dentz@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-12 17:30:42 -08:00
Linus Torvalds
f1b785f4c7 virtio: bugfix
A last minute mlx5 bugfix
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQFDBAABCAAtFiEEXQn9CHHI+FuUyooNKB8NuNKNVGkFAmcz5NMPHG1zdEByZWRo
 YXQuY29tAAoJECgfDbjSjVRptJcIAMuzl2PX/HzYwMBGI7TrRObAaZo8j6Zub9Be
 TVw6H33OKK86y2MoBz1hTj1Z32KA+qAGJEui03ckrFpHSVkzRvNGXJEI2rbtY5sX
 bmP3ch/9Yr4aEw1eF1cpcQlTMyFFFoeqbTLf5qBItsZ+qMfqiknAeSRL31YDBteK
 uOWaTPHMW8nNyy6wQaI9dEdP84Dluhx+B/IxcGcl8FySpSl+faA/uHr5YJP9kTO4
 e7PxFYa0oBeCqu7varkVRHuaoMaPk4OCrjeZWZAY9dp9LOGtfgh/YbYj7wsNfkXH
 mvKy8lRu+o1/Fh6bRc0TxNmtvOPpB1Myto6wj4ntEtNQun3khOo=
 =uDap
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fix from Michael Tsirkin:
 "A last minute mlx5 bugfix"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vdpa/mlx5: Fix PA offset with unaligned starting iotlb map
2024-11-12 16:39:34 -08:00
Johannes Weiner
dcf32ea7ec mm: swapfile: fix cluster reclaim work crash on rotational devices
syzbot and Daan report a NULL pointer crash in the new full swap cluster
reclaim work:

> Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI
> KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f]
> CPU: 1 UID: 0 PID: 51 Comm: kworker/1:1 Not tainted 6.12.0-rc6-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
> Workqueue: events swap_reclaim_work
> RIP: 0010:__list_del_entry_valid_or_report+0x20/0x1c0 lib/list_debug.c:49
> Code: 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 fe 48 83 c7 08 48 83 ec 18 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 19 01 00 00 48 89 f2 48 8b 4e 08 48 b8 00 00 00
> RSP: 0018:ffffc90000bb7c30 EFLAGS: 00010202
> RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff88807b9ae078
> RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000008
> RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
> R10: 0000000000000001 R11: 000000000000004f R12: dffffc0000000000
> R13: ffffffffffffffb8 R14: ffff88807b9ae000 R15: ffffc90003af1000
> FS:  0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007fffaca68fb8 CR3: 00000000791c8000 CR4: 00000000003526f0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> Call Trace:
>  <TASK>
>  __list_del_entry_valid include/linux/list.h:124 [inline]
>  __list_del_entry include/linux/list.h:215 [inline]
>  list_move_tail include/linux/list.h:310 [inline]
>  swap_reclaim_full_clusters+0x109/0x460 mm/swapfile.c:748
>  swap_reclaim_work+0x2e/0x40 mm/swapfile.c:779

The syzbot console output indicates a virtual environment where swapfile
is on a rotational device.  In this case, clusters aren't actually used,
and si->full_clusters is not initialized.  Daan's report is from qemu, so
likely rotational too.

Make sure to only schedule the cluster reclaim work when clusters are
actually in use.

Link: https://lkml.kernel.org/r/20241107142335.GB1172372@cmpxchg.org
Link: https://lore.kernel.org/lkml/672ac50b.050a0220.2edce.1517.GAE@google.com/
Link: https://github.com/systemd/systemd/issues/35044
Fixes: 5168a68eb78f ("mm, swap: avoid over reclaim of full clusters")
Reported-by: syzbot+078be8bfa863cb9e0c6b@syzkaller.appspotmail.com
Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
Reported-by: Daan De Meyer <daan.j.demeyer@gmail.com>
Cc: Kairui Song <ryncsn@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-12 16:01:36 -08:00
Si-Wei Liu
29ce8b8a4f vdpa/mlx5: Fix PA offset with unaligned starting iotlb map
When calculating the physical address range based on the iotlb and mr
[start,end) ranges, the offset of mr->start relative to map->start
is not taken into account. This leads to some incorrect and duplicate
mappings.

For the case when mr->start < map->start the code is already correct:
the range in [mr->start, map->start) was handled by a different
iteration.
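
The offset handling described above can be sketched as a small range
calculation. This is an illustration under assumptions, not the mlx5 vdpa
code: the function and parameter names are hypothetical, and ranges are
half-open [start, end).

```python
def overlap_phys_range(map_start, map_end, map_pa, mr_start, mr_end):
    """Return (physical address, length) of the part of the iotlb map
    covered by the mr range, or None if the ranges do not overlap."""
    lo = max(map_start, mr_start)
    hi = min(map_end, mr_end)
    if lo >= hi:
        return None
    # The physical base must be advanced by how far the overlap starts
    # into the map; omitting this offset is the bug class described above.
    return map_pa + (lo - map_start), hi - lo


# mr starts inside the map: the offset is applied.
print(overlap_phys_range(0x1000, 0x5000, 0x80000, 0x2000, 0x4000))
# mr starts before the map: no offset is needed.
print(overlap_phys_range(0x1000, 0x5000, 0x80000, 0x0, 0x3000))
```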

Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code")
Cc: stable@vger.kernel.org
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Message-Id: <20241021134040.975221-2-dtatulea@nvidia.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2024-11-12 18:05:04 -05:00
Linus Torvalds
14b6320953 KVM x86 and selftests fixes for 6.12:
x86:
 
 - When emulating a guest TLB flush for a nested guest, flush vpid01, not
   vpid02, if L2 is active but VPID is disabled in vmcs12, i.e. if L2 and
   L1 are sharing VPID '0' (from L1's perspective).
 
 - Fix a bug in the SNP initialization flow where KVM would return '0' to
   userspace instead of -errno on failure.
 
 - Move the Intel PT virtualization (i.e. outputting host trace to host
   buffer and guest trace to guest buffer) behind CONFIG_BROKEN.
 
 - Fix memory leak on failure of KVM_SEV_SNP_LAUNCH_START
 
 - Fix a bug where KVM fails to inject an interrupt from the IRR after
   KVM_SET_LAPIC.
 
 Selftests:
 
 - Increase the timeout for the memslot performance selftest to avoid false
   failures on arm64 and nested x86 platforms.
 
 - Fix a goof in the guest_memfd selftest where a for-loop initialized a
   bit mask to zero instead of BIT(0).
 
 - Disable strict aliasing when building KVM selftests to prevent the
   compiler from treating things like "u64 *" to "uint64_t *" cases as
   undefined behavior, which can lead to nasty, hard to debug failures.
 
 - Force -march=x86-64-v2 for KVM x86 selftests if and only if the uarch
   is supported by the compiler.
 
 - Fix broken compilation of kvm selftests after a header sync in tools/
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmczm1QUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroOLKwf+IjkJHZ/LS95HuP/0QLM17Sc4MmiZ
 Pk5gLd5un7BBSLA98RvALR/YPnsA7emEJ34bE/8lQ6R5VSZ5PrIzF+29f60HzRFe
 EDi1/24dqnzdWn50na5nk7A2QhFpfnLQQTl7vMqPFsrU7gfLuHQI6ABp9kloEwP/
 xnjAT683IWNX9v0N2A8kNemy9NNMGssJk1ssDTGzNflSyRNL8cLPGlPkZqAIMsM6
 fHjkDRg0UxasUDkL5CjwnTSdBGoz+/Myyz4unFlYGJB9D3+ev2qDlMqATO4Jfik/
 peJMZ65i8/8/7MgKCTn8qQuT0FLLEvxTuzDHUSGzjMZl0DGaZi2BPETNqg==
 =nW8/
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm fixes from Paolo Bonzini:
 "x86 and selftests fixes.

  x86:

   - When emulating a guest TLB flush for a nested guest, flush vpid01,
     not vpid02, if L2 is active but VPID is disabled in vmcs12, i.e. if
     L2 and L1 are sharing VPID '0' (from L1's perspective).

   - Fix a bug in the SNP initialization flow where KVM would return '0'
     to userspace instead of -errno on failure.

   - Move the Intel PT virtualization (i.e. outputting host trace to
     host buffer and guest trace to guest buffer) behind CONFIG_BROKEN.

   - Fix memory leak on failure of KVM_SEV_SNP_LAUNCH_START

   - Fix a bug where KVM fails to inject an interrupt from the IRR after
     KVM_SET_LAPIC.

  Selftests:

   - Increase the timeout for the memslot performance selftest to avoid
     false failures on arm64 and nested x86 platforms.

   - Fix a goof in the guest_memfd selftest where a for-loop initialized
     a bit mask to zero instead of BIT(0).

   - Disable strict aliasing when building KVM selftests to prevent the
     compiler from treating things like "u64 *" to "uint64_t *" cases as
     undefined behavior, which can lead to nasty, hard to debug
     failures.

   - Force -march=x86-64-v2 for KVM x86 selftests if and only if the
     uarch is supported by the compiler.

   - Fix broken compilation of kvm selftests after a header sync in
     tools/"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
  KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN
  KVM: x86: Unconditionally set irr_pending when updating APICv state
  kvm: svm: Fix gctx page leak on invalid inputs
  KVM: selftests: use X86_MEMTYPE_WB instead of VMX_BASIC_MEM_TYPE_WB
  KVM: SVM: Propagate error from snp_guest_req_init() to userspace
  KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabled
  KVM: selftests: Don't force -march=x86-64-v2 if it's unsupported
  KVM: selftests: Disable strict aliasing
  KVM: selftests: fix unintentional noop test in guest_memfd_test.c
  KVM: selftests: memslot_perf_test: increase guest sync timeout
2024-11-12 13:35:13 -08:00
Linus Torvalds
5456ec9dab - fix warnings about duplicate slab cache names
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRnH8MwLyZDhyYfesYTAyx9YGnhbQUCZzItHBQcbXBhdG9ja2FA
 cmVkaGF0LmNvbQAKCRATAyx9YGnhbRzaAQD0c/ERvaahr7yR3fD7b1pyT6g6LpwE
 e2P80QUQRKhCPAD9EXwZo1DdpCNQX7g6eU3jtof9oQoAggAdMjJRc4SF+g4=
 =W477
 -----END PGP SIGNATURE-----

Merge tag 'for-6.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm

Pull device mapper fixes from Mikulas Patocka:

 - fix warnings about duplicate slab cache names

* tag 'for-6.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm:
  dm-cache: fix warnings about duplicate slab caches
  dm-bufio: fix warnings about duplicate slab caches
2024-11-12 13:21:07 -08:00
Linus Torvalds
93db202ce0 integrity-v6.12
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYKADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZzNj9BQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5QKDAQCkbTcWVTnMrdz/0hV9JVmoLCFs6GWZ
 cTjaBApOQge1pgD/bTQGJ0fYP6sWEzMPSTMXr6uJaJtlmpsGdPNoOmKUTQU=
 =+K7B
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fixes from Mimi Zohar:
 "One bug fix, one performance improvement, and the use of
  static_assert:

   - The bug fix addresses an "only a cosmetic change" commit, which
     didn't take into account the original 'ima' template definition.

   - The performance improvement limits the atomic_read()"

* tag 'integrity-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  integrity: Use static_assert() to check struct sizes
  evm: stop avoidably reading i_writecount in evm_file_release
  ima: fix buffer overrun in ima_eventdigest_init_common
2024-11-12 13:06:31 -08:00
Linus Torvalds
92dda329e3 Landlock fix for v6.12-rc7
-----BEGIN PGP SIGNATURE-----
 
 iIYEABYKAC4WIQSVyBthFV4iTW/VU1/l49DojIL20gUCZy+y6BAcbWljQGRpZ2lr
 b2QubmV0AAoJEOXj0OiMgvbSXcAA/jrpBdfMi6MXbZkOXHw2H46j2jpBpOq67pND
 LqC2LA8bAP9JyiGdFF8ETch59zSa3mpKp4C/k8g/F3XwmSsqLOh3BA==
 =pF7T
 -----END PGP SIGNATURE-----

Merge tag 'landlock-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux

Pull landlock fixes from Mickaël Salaün:
 "This fixes issues in the Landlock's sandboxer sample and
  documentation, slightly refactors helpers (required for ongoing patch
  series), and improves/fixes a feature merged in v6.12 (signal and
  abstract UNIX socket scoping)"

* tag 'landlock-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux:
  landlock: Optimize scope enforcement
  landlock: Refactor network access mask management
  landlock: Refactor filesystem access mask management
  samples/landlock: Clarify option parsing behaviour
  samples/landlock: Refactor help message
  samples/landlock: Fix port parsing in sandboxer
  landlock: Fix grammar issues in documentation
  landlock: Improve documentation of previous limitations
2024-11-12 13:01:09 -08:00
Kalle Valo
11597043d7 Revert "wifi: iwlegacy: do not skip frames with bad FCS"
This reverts commit 02b682d54598f61cbb7dbb14d98ec1801112b878.

Alf reports that this commit causes the connection to eventually die on
iwl4965. The reason is that rx_status.flag is zeroed after
RX_FLAG_FAILED_FCS_CRC is set and mac80211 doesn't know the received frame is
corrupted.
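
The ordering problem described above can be illustrated with a small
userspace sketch. This is a toy model in Python, not the driver code: the
bit value chosen for RX_FLAG_FAILED_FCS_CRC and both function names are
assumptions made for the example.

```python
# Hypothetical bit position; the real value lives in the mac80211 headers.
RX_FLAG_FAILED_FCS_CRC = 1 << 0


def fill_status_flag_then_zero(bad_fcs: bool) -> int:
    """Set the FCS flag first, then zero the field, as in the reverted change."""
    flag = 0
    if bad_fcs:
        flag |= RX_FLAG_FAILED_FCS_CRC
    flag = 0  # a later reset wipes the bit that was just set
    return flag


def fill_status_zero_then_flag(bad_fcs: bool) -> int:
    """Zero the field first, then set the FCS flag, so the bit survives."""
    flag = 0
    if bad_fcs:
        flag |= RX_FLAG_FAILED_FCS_CRC
    return flag
```

In the first variant a corrupted frame is reported with no flag set, which
matches the symptom that mac80211 cannot tell the frame is bad.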

Fixes: 02b682d54598 ("wifi: iwlegacy: do not skip frames with bad FCS")
Reported-by: Alf Marius <post@alfmarius.net>
Closes: https://lore.kernel.org/r/60f752e8-787e-44a8-92ae-48bdfc9b43e7@app.fastmail.com/
Signed-off-by: Kalle Valo <kvalo@kernel.org>
Link: https://patch.msgid.link/20241112142419.1023743-1-kvalo@kernel.org
2024-11-12 20:24:45 +02:00
Donet Tom
fae1980347 selftests: hugetlb_dio: fixup check for initial conditions to skip in the start
This test verifies that a hugepage, used as a user buffer for DIO
operations, is correctly freed upon unmapping.  To test this, we read the
count of free hugepages before and after the mmap, DIO, and munmap
operations, then check if the free hugepage count is the same.

Reading free hugepages before the test was removed by commit 0268d4579901
('selftests: hugetlb_dio: check for initial conditions to skip at the
start'), causing the test to always fail.

This patch adds back reading the free hugepages before starting the test. 
With this patch, the tests are now passing.
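
The before/after comparison described above can be sketched in userspace
as follows. This is a minimal illustration in Python, not the selftest's
own code; it assumes the free count is taken from the HugePages_Free field
of /proc/meminfo, and the helper names and sample text are hypothetical.

```python
import re


def parse_free_hugepages(meminfo_text: str) -> int:
    """Return the HugePages_Free value from /proc/meminfo-style text."""
    match = re.search(r"^HugePages_Free:\s+(\d+)\s*$", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("HugePages_Free not found")
    return int(match.group(1))


def hugepages_were_freed(free_before: int, free_after: int) -> bool:
    """The test passes only if the free count is unchanged after munmap."""
    return free_before == free_after


# Sample text for illustration; on a real system read /proc/meminfo instead.
SAMPLE_MEMINFO = (
    "HugePages_Total:     100\n"
    "HugePages_Free:      100\n"
    "HugePages_Rsvd:        0\n"
)
```

If the "before" value is never read and stays at 0, the comparison fails
even though every page was returned, which is the failure shown in the
results below.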

Test results without this patch:

./tools/testing/selftests/mm/hugetlb_dio
TAP version 13
1..4
 # No. Free pages before allocation : 0
 # No. Free pages after munmap : 100
not ok 1 : Huge pages not freed!
 # No. Free pages before allocation : 0
 # No. Free pages after munmap : 100
not ok 2 : Huge pages not freed!
 # No. Free pages before allocation : 0
 # No. Free pages after munmap : 100
not ok 3 : Huge pages not freed!
 # No. Free pages before allocation : 0
 # No. Free pages after munmap : 100
not ok 4 : Huge pages not freed!
 # Totals: pass:0 fail:4 xfail:0 xpass:0 skip:0 error:0

Test results with this patch:

./tools/testing/selftests/mm/hugetlb_dio
TAP version 13
1..4
# No. Free pages before allocation : 100
# No. Free pages after munmap : 100
ok 1 : Huge pages freed successfully !
# No. Free pages before allocation : 100
# No. Free pages after munmap : 100
ok 2 : Huge pages freed successfully !
# No. Free pages before allocation : 100
# No. Free pages after munmap : 100
ok 3 : Huge pages freed successfully !
# No. Free pages before allocation : 100
# No. Free pages after munmap : 100
ok 4 : Huge pages freed successfully !

# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0

Link: https://lkml.kernel.org/r/20241110064903.23626-1-donettom@linux.ibm.com
Fixes: 0268d4579901 ("selftests: hugetlb_dio: check for initial conditions to skip in the start")
Signed-off-by: Donet Tom <donettom@linux.ibm.com>
Cc: Muhammad Usama Anjum <usama.anjum@collabora.com>
Cc: Shuah Khan <shuah@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-12 10:14:00 -08:00
Hugh Dickins
a3477c9e02 mm/thp: fix deferred split queue not partially_mapped: fix
Though even more elusive than before, list_del corruption has still been
seen on THP's deferred split queue.

The idea in commit e66f3185fa04 was right, but its implementation wrong. 
The context omitted an important comment just before the critical test:
"split_folio() removes folio from list on success." In ignoring that
comment, when a THP split succeeded, the code went on to release the
preceding safe folio, preserving instead an irrelevant (formerly head)
folio: which gives no safety because it's not on the list.  Fix the logic.

Link: https://lkml.kernel.org/r/3c995a30-31ce-0998-1b9f-3a2cb9354c91@google.com
Fixes: e66f3185fa04 ("mm/thp: fix deferred split queue not partially_mapped")
Signed-off-by: Hugh Dickins <hughd@google.com>
Acked-by: Usama Arif <usamaarif642@gmail.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Barry Song <baohua@kernel.org>
Cc: Chris Li <chrisl@kernel.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Kefeng Wang <wangkefeng.wang@huawei.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Nhat Pham <nphamcs@gmail.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Yang Shi <shy828301@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-12 10:14:00 -08:00
John Hubbard
94efde1d15 mm/gup: avoid an unnecessary allocation call for FOLL_LONGTERM cases
commit 53ba78de064b ("mm/gup: introduce
check_and_migrate_movable_folios()") created a new constraint on the
pin_user_pages*() API family: a potentially large internal allocation must
now occur, for FOLL_LONGTERM cases.

A user-visible consequence has now appeared: user space can no longer pin
more than 2GB of memory on x86_64.  That's because, on a 4KB
PAGE_SIZE system, when user space tries to (indirectly, via a device
driver that calls pin_user_pages()) pin 2GB, this requires an allocation
of a folio pointers array of MAX_PAGE_ORDER size, which is the limit for
kmalloc().
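
The arithmetic behind the 2GB figure can be checked with a short sketch.
It is written in Python for convenience and rests on two assumptions that
are not stated in this message: pointers are 8 bytes wide, and
MAX_PAGE_ORDER is 10 (believed to be the x86_64 default).

```python
PAGE_SIZE = 4096      # 4KB pages, as stated above
POINTER_SIZE = 8      # assumption: 64-bit pointers on x86_64
MAX_PAGE_ORDER = 10   # assumption: default value on x86_64


def folio_array_bytes(pin_bytes: int) -> int:
    """Size of an array holding one pointer per pinned page."""
    nr_pages = pin_bytes // PAGE_SIZE
    return nr_pages * POINTER_SIZE


# Largest contiguous allocation under the assumed MAX_PAGE_ORDER.
MAX_ALLOC_BYTES = PAGE_SIZE << MAX_PAGE_ORDER
TWO_GIB = 2 << 30
```

Under these assumptions, pinning exactly 2GB needs a pointer array equal
to the largest allowed allocation, and pinning one page more needs an
array that exceeds it.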

In addition to the directly visible effect described above, there is also
the problem of adding an unnecessary allocation.  The **pages array
argument has already been allocated, and there is no need for a redundant
**folios array allocation in this case.

Fix this by avoiding the new allocation entirely.  This is done by
referring to either the original page[i] within **pages, or to the
associated folio.  Thanks to David Hildenbrand for suggesting this
approach and for providing the initial implementation (which I've tested
and adjusted slightly) as well.

[jhubbard@nvidia.com: whitespace tweak, per David]
  Link: https://lkml.kernel.org/r/131cf9c8-ebc0-4cbb-b722-22fa8527bf3c@nvidia.com
[jhubbard@nvidia.com: bypass pofs_get_folio(), per Oscar]
  Link: https://lkml.kernel.org/r/c1587c7f-9155-45be-bd62-1e36c0dd6923@nvidia.com
Link: https://lkml.kernel.org/r/20241105032944.141488-2-jhubbard@nvidia.com
Fixes: 53ba78de064b ("mm/gup: introduce check_and_migrate_movable_folios()")
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Peter Xu <peterx@redhat.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Cc: Hugh Dickins <hughd@google.com>
Cc: Junxiao Chang <junxiao.chang@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-12 10:14:00 -08:00
Kiran K
d5359a7f58 Bluetooth: btintel: Direct exception event to bluetooth stack
Make the exception event part of the HCI traces, which helps with debugging.

snoop traces:
> HCI Event: Vendor (0xff) plen 79
        Vendor Prefix (0x8780)
      Intel Extended Telemetry (0x03)
        Unknown extended telemetry event type (0xde)
        01 01 de
        Unknown extended subevent 0x07
        01 01 de 07 01 de 06 1c ef be ad de ef be ad de
        ef be ad de ef be ad de ef be ad de ef be ad de
        ef be ad de 05 14 ef be ad de ef be ad de ef be
        ad de ef be ad de ef be ad de 43 10 ef be ad de
        ef be ad de ef be ad de ef be ad de

Fixes: af395330abed ("Bluetooth: btintel: Add Intel devcoredump support")
Signed-off-by: Kiran K <kiran.k@intel.com>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-12 11:39:12 -05:00
Luiz Augusto von Dentz
7967dc8f79 Bluetooth: hci_core: Fix calling mgmt_device_connected
Since 61a939c68ee0 ("Bluetooth: Queue incoming ACL data until
BT_CONNECTED state is reached") there is no longer any need to call
mgmt_device_connected as ACL data will be queued until BT_CONNECTED
state.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=219458
Link: https://github.com/bluez/bluez/issues/1014
Fixes: 333b4fd11e89 ("Bluetooth: L2CAP: Fix uaf in l2cap_connect")
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
2024-11-12 11:39:12 -05:00
Johannes Berg
f2aadc7212 wifi: mac80211: pass MBSSID config by reference
It's inefficient and confusing to pass the MBSSID config
by value, requiring the whole struct to be copied. Pass
it by reference instead.

Link: https://patch.msgid.link/20241108092227.48fbd8a00112.I64abc1296a7557aadf798d88db931024486ab3b6@changeid
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-11-12 13:46:55 +01:00
MeiChia Chiu
406c5548c6 wifi: mac80211: Support EHT 1024 aggregation size in TX
Support EHT 1024 aggregation size in TX

The 1024 agg size for RX is supported but not for TX.
This patch adds this support and refactors the common parsing logic for
addbaext in both process_addba_resp and process_addba_req into a
function.

Reviewed-by: Shayne Chen <shayne.chen@mediatek.com>
Reviewed-by: Money Wang <money.wang@mediatek.com>
Co-developed-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com>
Signed-off-by: MeiChia Chiu <MeiChia.Chiu@mediatek.com>
Link: https://patch.msgid.link/20241112083846.32063-1-MeiChia.Chiu@mediatek.com
[pass elems/len instead of mgmt/len/is_req]
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-11-12 13:41:45 +01:00
Mingwei Zheng
8251e7621b net: rfkill: gpio: Add check for clk_enable()
Add check for the return value of clk_enable() to catch the potential
error.

Fixes: 7176ba23f8b5 ("net: rfkill: add generic gpio rfkill driver")
Signed-off-by: Mingwei Zheng <zmw12306@gmail.com>
Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com>
Link: https://patch.msgid.link/20241108195341.1853080-1-zmw12306@gmail.com
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
2024-11-12 13:30:31 +01:00
Jakub Kicinski
a58f00ed24 net: sched: cls_api: improve the error message for ID allocation failure
We ran into an exhaustion problem with the kernel-allocated filter IDs.
Our allocation problem can be fixed on the user space side,
but the error message in this case was quite misleading:

  "Filter with specified priority/protocol not found" (EINVAL)

Specifically, when we can't allocate a _new_ ID because the filter with the
lowest ID already _exists_, saying "filter not found" is confusing.

Kernel allocates IDs in range of 0xc0000 -> 0x8000, giving out ID one
lower than lowest existing in that range. The error message makes sense
when tcf_chain_tp_find() gets called for GET and DEL but for NEW we
need to provide more specific error messages for all three cases:

 - user wants the ID to be auto-allocated but filter with ID 0x8000
   already exists

 - filter already exists and can be replaced, but user asked
   for a protocol change

 - filter doesn't exist

Caller of tcf_chain_tp_insert_unique() doesn't set extack today,
so don't bother plumbing it in.
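
The auto-allocation rule and the exhaustion case can be modelled roughly
as below. This is a simplified sketch in Python, not the cls_api code: the
bounds are copied as written in this message, the function name is
hypothetical, and the error text is made up for the example.

```python
ID_START = 0xC0000  # upper bound, as written in the message above
ID_FLOOR = 0x8000   # lower bound, as written in the message above


def auto_allocate_id(existing_ids, start=ID_START, floor=ID_FLOOR):
    """Hand out an ID one lower than the lowest existing ID in the range."""
    in_range = [i for i in existing_ids if floor <= i <= start]
    if not in_range:
        return start
    lowest = min(in_range)
    if lowest == floor:
        # Nothing lower is left. "Filter not found" would mislead here,
        # because the real problem is that the lowest ID already exists.
        raise RuntimeError("cannot auto-allocate: lowest ID already in use")
    return lowest - 1
```

For example, with an empty set the first ID handed out is the upper
bound, and each later allocation moves one step down until the floor is
occupied.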

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://patch.msgid.link/20241108010254.2995438-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 12:58:31 +01:00
Paolo Abeni
20bbe5b802 Merge branch 'virtio-vsock-fix-memory-leaks'
Michal Luczaj says:

====================
virtio/vsock: Fix memory leaks

Short series fixing some memory leaks that I've stumbled upon while toying
with the selftests.

Signed-off-by: Michal Luczaj <mhal@rbox.co>
====================

Link: https://patch.msgid.link/20241107-vsock-mem-leaks-v2-0-4e21bfcfc818@rbox.co
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 12:16:55 +01:00
Michal Luczaj
60cf6206a1 virtio/vsock: Improve MSG_ZEROCOPY error handling
Add a missing kfree_skb() to prevent memory leaks.

Fixes: 581512a6dc93 ("vsock/virtio: MSG_ZEROCOPY flag support")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 12:16:51 +01:00
Michal Luczaj
fbf7085b3a vsock: Fix sk_error_queue memory leak
The kernel queues MSG_ZEROCOPY completion notifications on the error queue,
where they remain until explicitly recv()ed. To prevent memory leaks,
clean up the queue when the socket is destroyed.

unreferenced object 0xffff8881028beb00 (size 224):
  comm "vsock_test", pid 1218, jiffies 4294694897
  hex dump (first 32 bytes):
    90 b0 21 17 81 88 ff ff 90 b0 21 17 81 88 ff ff  ..!.......!.....
    00 00 00 00 00 00 00 00 00 b0 21 17 81 88 ff ff  ..........!.....
  backtrace (crc 6c7031ca):
    [<ffffffff81418ef7>] kmem_cache_alloc_node_noprof+0x2f7/0x370
    [<ffffffff81d35882>] __alloc_skb+0x132/0x180
    [<ffffffff81d2d32b>] sock_omalloc+0x4b/0x80
    [<ffffffff81d3a8ae>] msg_zerocopy_realloc+0x9e/0x240
    [<ffffffff81fe5cb2>] virtio_transport_send_pkt_info+0x412/0x4c0
    [<ffffffff81fe6183>] virtio_transport_stream_enqueue+0x43/0x50
    [<ffffffff81fe0813>] vsock_connectible_sendmsg+0x373/0x450
    [<ffffffff81d233d5>] ____sys_sendmsg+0x365/0x3a0
    [<ffffffff81d246f4>] ___sys_sendmsg+0x84/0xd0
    [<ffffffff81d26f47>] __sys_sendmsg+0x47/0x80
    [<ffffffff820d3df3>] do_syscall_64+0x93/0x180
    [<ffffffff8220012b>] entry_SYSCALL_64_after_hwframe+0x76/0x7e

Fixes: 581512a6dc93 ("vsock/virtio: MSG_ZEROCOPY flag support")
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 12:16:51 +01:00
Michal Luczaj
d7b0ff5a86 virtio/vsock: Fix accept_queue memory leak
As the final stages of socket destruction may be delayed, it is possible
that virtio_transport_recv_listen() will be called after the accept_queue
has been flushed, but before the SOCK_DONE flag has been set. As a result,
sockets enqueued after the flush would remain unremoved, leading to a
memory leak.

vsock_release
  __vsock_release
    lock
    virtio_transport_release
      virtio_transport_close
        schedule_delayed_work(close_work)
    sk_shutdown = SHUTDOWN_MASK
(!) flush accept_queue
    release
                                        virtio_transport_recv_pkt
                                          vsock_find_bound_socket
                                          lock
                                          if flag(SOCK_DONE) return
                                          virtio_transport_recv_listen
                                            child = vsock_create_connected
                                      (!)   vsock_enqueue_accept(child)
                                          release
close_work
  lock
  virtio_transport_do_close
    set_flag(SOCK_DONE)
    virtio_transport_remove_sock
      vsock_remove_sock
        vsock_remove_bound
  release

Introduce a sk_shutdown check to disallow vsock_enqueue_accept() during
socket destruction.
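
The interleaving above can be replayed with a toy model. This is a sketch
in Python, not the vsock code: the class and function names only mirror
the steps in the diagram, and the exact condition used by the patch is an
assumption here.

```python
SHUTDOWN_MASK = 3  # RCV_SHUTDOWN | SEND_SHUTDOWN


class Listener:
    def __init__(self):
        self.sk_shutdown = 0
        self.sock_done = False   # stands in for the SOCK_DONE flag
        self.accept_queue = []


def release(listener):
    """First column of the diagram: mark shutdown, then flush the queue."""
    listener.sk_shutdown = SHUTDOWN_MASK
    listener.accept_queue.clear()


def recv_listen(listener, child, check_shutdown):
    """Second column: a connection request arriving after the flush."""
    if listener.sock_done:
        return False
    if check_shutdown and listener.sk_shutdown == SHUTDOWN_MASK:
        return False             # assumed form of the new check
    listener.accept_queue.append(child)
    return True


def close_work(listener):
    """Delayed work that finally sets SOCK_DONE."""
    listener.sock_done = True


def leaked_children(check_shutdown):
    listener = Listener()
    release(listener)
    recv_listen(listener, "child", check_shutdown)
    close_work(listener)
    return len(listener.accept_queue)
```

Without the check one child is left on the queue after the flush; with
the check the late enqueue is refused and nothing is left behind.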

unreferenced object 0xffff888109e3f800 (size 2040):
  comm "kworker/5:2", pid 371, jiffies 4294940105
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    28 00 0b 40 00 00 00 00 00 00 00 00 00 00 00 00  (..@............
  backtrace (crc 9e5f4e84):
    [<ffffffff81418ff1>] kmem_cache_alloc_noprof+0x2c1/0x360
    [<ffffffff81d27aa0>] sk_prot_alloc+0x30/0x120
    [<ffffffff81d2b54c>] sk_alloc+0x2c/0x4b0
    [<ffffffff81fe049a>] __vsock_create.constprop.0+0x2a/0x310
    [<ffffffff81fe6d6c>] virtio_transport_recv_pkt+0x4dc/0x9a0
    [<ffffffff81fe745d>] vsock_loopback_work+0xfd/0x140
    [<ffffffff810fc6ac>] process_one_work+0x20c/0x570
    [<ffffffff810fce3f>] worker_thread+0x1bf/0x3a0
    [<ffffffff811070dd>] kthread+0xdd/0x110
    [<ffffffff81044fdd>] ret_from_fork+0x2d/0x50
    [<ffffffff8100785a>] ret_from_fork_asm+0x1a/0x30

Fixes: 3fe356d58efa ("vsock/virtio: discard packets only when socket is really closed")
Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Signed-off-by: Michal Luczaj <mhal@rbox.co>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 12:16:51 +01:00
Breno Leitao
12079a59ce net: Implement fault injection forcing skb reallocation
Introduce a fault injection mechanism to force skb reallocation. The
primary goal is to catch bugs related to pointer invalidation after
potential skb reallocation.

The fault injection mechanism aims to identify scenarios where callers
retain pointers to various headers in the skb but fail to reload these
pointers after calling a function that may reallocate the data. This
type of bug can lead to memory corruption or crashes if the old,
now-invalid pointers are used.

By forcing reallocation through fault injection, we can stress-test code
paths and ensure proper pointer management after potential skb
reallocations.

Add a hook for fault injection in the following functions:

 * pskb_trim_rcsum()
 * pskb_may_pull_reason()
 * pskb_trim()

As with the other fault injection mechanisms, protect it under a debug
Kconfig option called CONFIG_FAIL_SKB_REALLOC.
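
The class of bug this targets can be shown with a toy buffer. This is a
sketch in Python that only imitates the idea; ToyBuffer, may_pull() and
the poison byte are hypothetical and do not correspond to kernel
functions.

```python
class ToyBuffer:
    """Stand-in for a packet buffer whose storage may move on pull."""

    def __init__(self, payload: bytes, force_realloc: bool = False):
        self.data = bytearray(payload)
        self.force_realloc = force_realloc  # plays the role of fault injection

    def may_pull(self, length: int) -> bool:
        if length > len(self.data):
            return False
        if self.force_realloc:
            old = self.data
            self.data = bytearray(old)    # contents move to new storage
            old[:] = b"\xde" * len(old)   # poison the old storage
        return True


def read_header_stale(buf: ToyBuffer) -> bytes:
    header = buf.data           # reference taken before the call
    buf.may_pull(4)
    return bytes(header[:4])    # stale if the storage was replaced


def read_header_reloaded(buf: ToyBuffer) -> bytes:
    buf.may_pull(4)
    header = buf.data           # reference reloaded after the call
    return bytes(header[:4])
```

Without forced reallocation both readers return the same bytes, so the
stale reference goes unnoticed; forcing the reallocation makes the stale
reader return poison every time.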

This patch was *heavily* inspired by Jakub's proposal from:
https://lore.kernel.org/all/20240719174140.47a868e6@kernel.org/

CC: Akinobu Mita <akinobu.mita@gmail.com>
Suggested-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
Link: https://patch.msgid.link/20241107-fault_v6-v6-1-1b82cb6ecacd@debian.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 12:05:33 +01:00
Paolo Abeni
12f077a728 Merge branch 'net-ip-add-drop-reasons-to-input-route'
Menglong Dong says:

====================
net: ip: add drop reasons to input route

In this series, we mainly add some skb drop reasons to the input path of
ip routing, and we make the following functions return drop reasons:

  fib_validate_source()
  ip_route_input_mc()
  ip_mc_validate_source()
  ip_route_input_slow()
  ip_route_input_rcu()
  ip_route_input_noref()
  ip_route_input()
  ip_mkroute_input()
  __mkroute_input()
  ip_route_use_hint()

The following new skb drop reasons are added:

  SKB_DROP_REASON_IP_LOCAL_SOURCE
  SKB_DROP_REASON_IP_INVALID_SOURCE
  SKB_DROP_REASON_IP_LOCALNET
  SKB_DROP_REASON_IP_INVALID_DEST

Changes since v4:
- in the 6th patch: remove the unneeded "else" in ip_expire()
- in the 8th patch: delete the unneeded comment in __mkroute_input()
- in the 9th patch: replace "return 0" with "return SKB_NOT_DROPPED_YET"
  in ip_route_use_hint()

Changes since v3:
- don't refactor fib_validate_source/__fib_validate_source, and introduce
  a wrapper for fib_validate_source() instead in the 1st patch.
- some small adjustments in patches 4-7

Changes since v2:
- refactor fib_validate_source and __fib_validate_source to make
  fib_validate_source return drop reasons
- add the 9th and 10th patches to make this series cover the input route
  code path

Changes since v1:
- make ip_route_input_noref/ip_route_input_rcu/ip_route_input_slow return
  drop reasons, instead of passing a local variable to their function
  arguments.
====================

Link: https://patch.msgid.link/20241107125601.1076814-1-dongml2@chinatelecom.cn
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 11:24:53 +01:00
Menglong Dong
479aed04e8 net: ip: make ip_route_use_hint() return drop reasons
In this commit, we make ip_route_use_hint() return drop reasons. The
drop reasons that we return are similar to what we do in
ip_route_input_slow(), and no drop reasons are added in this commit.
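
The calling convention can be sketched as follows. This is an
illustration in Python, not the routing code: only SKB_NOT_DROPPED_YET
being 0 is relied on, the other enum values are placeholders, and the
source-address checks are assumptions chosen to show the pattern.

```python
from enum import IntEnum
from ipaddress import IPv4Address


class DropReason(IntEnum):
    SKB_NOT_DROPPED_YET = 0
    # Placeholder values; the real numbering is defined by the kernel.
    SKB_DROP_REASON_IP_INVALID_SOURCE = 1
    SKB_DROP_REASON_IP_LOCALNET = 2


def route_use_hint(saddr: str) -> DropReason:
    """Return why the packet would be dropped, or SKB_NOT_DROPPED_YET."""
    addr = IPv4Address(saddr)
    if addr.is_multicast or addr == IPv4Address("255.255.255.255"):
        return DropReason.SKB_DROP_REASON_IP_INVALID_SOURCE
    if addr.is_loopback:
        return DropReason.SKB_DROP_REASON_IP_LOCALNET
    return DropReason.SKB_NOT_DROPPED_YET


def handle(saddr: str) -> str:
    reason = route_use_hint(saddr)
    if reason:  # any nonzero value means "drop, and here is why"
        return "dropped: " + reason.name
    return "routed"
```

Because the success value is 0, callers that previously tested for a
nonzero error can keep the same shape while gaining a specific reason.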

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 11:24:51 +01:00
Menglong Dong
d9340d1e02 net: ip: make ip_mkroute_input/__mkroute_input return drop reasons
In this commit, we make ip_mkroute_input() and __mkroute_input() return
drop reasons.

The drop reason "SKB_DROP_REASON_ARP_PVLAN_DISABLE" is introduced for
the case where a non-IP packet is forwarded to the in_dev and
proxy_arp_pvlan is not enabled.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-12 11:24:51 +01:00