linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-15 17:43:59 +00:00

Author	SHA1	Message	Date
SkyLake.Huang	3cb1a3c9cb	net: phy: mediatek: Integrate read/write page helper functions This patch integrates read/write page helper functions as MTK phy lib. They are basically the same in mtk-ge.c & mtk-ge-soc.c. Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 13:06:04 +00:00
SkyLake.Huang	477c200aa7	net: phy: mediatek: Improve readability of mtk-phy-lib.c's mtk_phy_led_hw_ctrl_set() This patch removes parens around TRIGGER_NETDEV_RX/TRIGGER_NETDEV_TX in mtk_phy_led_hw_ctrl_set(), which improves readability. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 13:06:04 +00:00
SkyLake.Huang	7f9c320c98	net: phy: mediatek: Move LED helper functions into mtk phy lib This patch creates mtk-phy-lib.c & mtk-phy.h and integrates mtk-ge-soc.c's LED helper functions so that we can use those helper functions in other MTK's ethernet phy driver. Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 13:06:04 +00:00
SkyLake.Huang	4c452f7ea8	net: phy: mediatek: Re-organize MediaTek ethernet phy drivers Re-organize MediaTek ethernet phy driver files and get ready to integrate some common functions and add new 2.5G phy driver. mtk-ge.c: MT7530 Gphy on MT7621 & MT7531 Gphy mtk-ge-soc.c: Built-in Gphy on MT7981 & Built-in switch Gphy on MT7988 mtk-2p5ge.c: Planned for built-in 2.5G phy on MT7988 Reviewed-by: Andrew Lunn <andrew@lunn.ch> Signed-off-by: SkyLake.Huang <skylake.huang@mediatek.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 13:06:04 +00:00
David S. Miller	8545b75bc4	Merge branch 'octeontx2-rvu-rep' Geetha sowjanya says: ==================== Introduce RVU representors This series adds representor support for each rvu device. When switchdev mode is enabled, representor netdev is registered for each rvu device. In implementation of representor model, one NIX HW LF with multiple SQ and RQ is reserved, where each RQ and SQ of the LF are mapped to a representor. A loopback channel is reserved to support packet path between representors and VFs. CN10K silicon supports 2 types of MACs, RPM and SDP. This patch set adds representor support for both RPM and SDP MAC interfaces. - Patch 1: Implements basic representor driver. - Patch 2: Add devlink support to create representor netdevs that can be used to manage VFs. - Patch 3: Implements basic netdev_ndo_ops. - Patch 4: Installs tcam rules to route packets between representor and VFs. - Patch 5: Enables fetching VF stats via representor interface. - Patch 6: Adds support to sync link state between representors and VFs. - Patch 7: Enables configuring VF MTU via representor netdevs. - Patch 8: Adds representors for sdp MAC. - Patch 9: Adds devlink port support. - Patch 10: Implements offload stats. - Patch 11: Implements tc offload support. - Patch 12: Adds documentation for rvu port representor. 
pci/0002:1c:00.0 Command to create PF/VF representor Rpf1vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether f6:43:83:ee:26:21 brd ff:ff:ff:ff:ff:ff Rpf1vf1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 12:b2:54:0e:24:54 brd ff:ff:ff:ff:ff:ff Rpf1vf2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 4a:12:c4:4c:32:62 brd ff:ff:ff:ff:ff:ff Rpf1vf3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether ca:cb:68:0e:e2:6e brd ff:ff:ff:ff:ff:ff Rpf2vf0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state DOWN mode DEFAULT group default qlen 1000 link/ether 06:cc:ad:b4:f0:93 brd ff:ff:ff:ff:ff:ff ~# devlink port pci/0002:1c:00.0/0: type eth netdev Rpf1vf0 flavour physical port 0 splittable false pci/0002:1c:00.0/1: type eth netdev Rpf1vf1 flavour pcivf controller 0 pfnum 1 vfnum 1 external false splittable false pci/0002:1c:00.0/2: type eth netdev Rpf1vf2 flavour pcivf controller 0 pfnum 1 vfnum 2 external false splittable false pci/0002:1c:00.0/3: type eth netdev Rpf1vf3 flavour pcivf controller 0 pfnum 1 vfnum 3 external false splittable false ----------- v11:v1: - Submitted refactoring changes as a separate patch set. https://lore.kernel.org/netdev/20241023161843.15543-1-gakula@marvell.com/T/ - Moved documentation to a separate patch. - patch 9: Added code changes to forward updated mac address to VF. - Implemented TC offload support. v10-v11: - As suggested by "Jiri Pirko" adjusted the documentation. - Added more commit description to patch1. v9-v10: - Fixed build warning w.r.t documentation. v8-v9: - Updated the documentation. v7-v8: - Implemented offload stats ndo. - Added documentation. v6-v7: - Rebased on top net-next branch. 
v5-v6: - Addressed review comments provided by "Simon Horman". - Added review tag. v4-v5: - Patch 3: Removed devm_* usage in rvu_rep_create() - Patch 3: Fixed build warnings. v3-v4: - Patch 2 & 3: Fixed coccinelle reported warnings. - Patch 10: Added devlink port support. v2-v3: - Used extack for error messages. - As suggested reworked commit messages. - Fixed sparse warning. v1-v2: -Fixed build warnings. -Address review comments provided by "Kalesh Anakkur Purayil". ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:12 +00:00
Geetha sowjanya	6050b04dca	Documentation: octeontx2: Add Documentation for RVU representors Adds documentation for creating and configuring rvu port representors Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:12 +00:00
Geetha sowjanya	6c40ca957f	octeontx2-pf: Adds TC offload support Implements tc offload support for rvu representors. Usage example: - Add tc rule to drop packets with vlan id 3 using port representor(Rpf1vf0). # tc filter add dev Rpf1vf0 protocol 802.1Q parent ffff: flower vlan_id 3 vlan_ethtype ipv4 skip_sw action drop - Redirect packets with vlan id 5 and IPv4 packets to eth1, after stripping vlan header. # tc filter add dev Rpf1vf0 ingress protocol 802.1Q flower vlan_id 5 vlan_ethtype ipv4 skip_sw action vlan pop action mirred ingress redirect dev eth1 Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:12 +00:00
Geetha sowjanya	d8dec30b51	octeontx2-pf: Implement offload stats ndo for representors Implement the offload stat ndo by fetching the HW stats of rx/tx queues attached to the representor. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	9ed0343f56	octeontx2-pf: Add devlink port support Register devlink port for the rvu representors. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	2f7f33a095	octeontx2-pf: Add representors for sdp MAC Hardware supports different types of MACs, e.g. RPM, SDP, LBK. LBK is for the internal Tx->Rx HW loopback path. RPM and SDP MACs support ingress/egress pkt IO on interfaces with different sets of capabilities like interface modes. At the time of netdev driver registration, the PF seeks MAC related information from the Admin function driver 'drivers/net/ethernet/marvell/octeontx2/af' and sets up ingress/egress queues etc. such that pkt IO on the channels of these different MACs is possible. This patch adds representors for SDP MAC. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	3392f91903	octeontx2-pf: Configure VF mtu via representor Adds support to manage the mtu configuration for a VF through its representor. On update of the representor mtu, a mbox notification is sent to the VF to update its mtu. This feature is implemented based on the "Network Function Representors" kernel documentation. " Setting an MTU on the representor should cause that same MTU to be reported to the representee. " Signed-off-by: Sai Krishna <saikrishnag@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	b8fea84a04	octeontx2-pf: Add support to sync link state between representor and VFs Implements the below requirement mentioned in the representors documentation. " The representee's link state is controlled through the representor. Setting the representor administratively UP or DOWN should cause carrier ON or OFF at the representee. " This patch enables - Reflecting the link state of the representor based on the VF state, and the link state of the VF based on the representor. - On VF interface up/down, a notification is sent via mbox to the representor to update the link state. e.g. ip link set eth0 up/down will set carrier on/off on the corresponding representor (r0p1) interface. - Representor interface up/down will cause a link state update of the VF. e.g. ip link set r0p1 up/down will set carrier on/off on the corresponding representee (eth0) interface. Signed-off-by: Harman Kalra <hkalra@marvell.com> Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	940754a21d	octeontx2-pf: Get VF stats via representor Adds support to export VF port statistics via representor netdev. Defines new mbox "NIX_LF_STATS" to fetch VF hw stats. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	683645a231	octeontx2-af: Add packet path between representor and VF Current HW does not support an in-built switch which would forward pkts between representee and representor. When a representor is put under a bridge and pkts need to be sent to the representee, the pkts from the representor are sent on a HW internal loopback channel, which again will be punted to the ingress pkt parser. The rules that this patch installs are the MCAM filters/rules which match against these pkts and forward them to the representee. They cover the basic representor <=> representee path, similar to Tun/TAP between VM and Host. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	22f8587967	octeontx2-pf: Add basic net_device_ops Implements basic set of net_device_ops. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	3937b7308d	octeontx2-pf: Create representor netdev Adds initial devlink support to set/get the switchdev mode. Representor netdevs are created for each rvu device when the switch mode is set to 'switchdev'. These netdevs are used to control and configure VFs. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Geetha sowjanya	222a4eea9c	octeontx2-pf: RVU representor driver Adds basic driver for the RVU representor. Driver on probe does pci specific initialization and does hw resources configuration. Introduces RVU_ESWITCH kernel config to enable/disable the driver. Representor and NIC shares the code but representors netdev support subset of NIC functionality. Hence "otx2_rep_dev" API helps to skip the features initialization that are not supported by the representors. Signed-off-by: Geetha sowjanya <gakula@marvell.com> Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2024-11-13 11:57:11 +00:00
Jakub Kicinski	ef04d290c0	net: page_pool: do not count normal frag allocation in stats Commit 0f6deac3a079 ("net: page_pool: add page allocation stats for two fast page allocate path") added increments for "fast path" allocation to page frag alloc. It mentions performance degradation analysis but the details are unclear. Could be that the author was simply surprised by the alloc stats not matching packet count. In my experience the key metric for page pool is the recycling rate. Page return stats, however, count returned _pages_ not frags. This makes it impossible to calculate recycling rate for drivers using the frag API. Here is example output of the page-pool YNL sample for a driver allocating 1200B frags (4k pages) with nearly perfect recycling: $ ./page-pool eth0[2] page pools: 32 (zombies: 0) refs: 291648 bytes: 1194590208 (refs: 0 bytes: 0) recycling: 33.3% (alloc: 4557:2256365862 recycle: 200476245:551541893) The recycling rate is reported as 33.3% because we give out 4096 // 1200 = 3 frags for every recycled page. Effectively revert the aforementioned commit. This also aligns with the stats we would see for drivers which do the fragmentation themselves, although that's not a strong reason in itself. On the (very unlikely) path where we can reuse the current page let's bump the "cached" stat. The fact that we don't put the page in the cache is just an optimization. Acked-by: Jesper Dangaard Brouer <hawk@kernel.org> Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com> Acked-by: Ilias Apalodimas <ilias.apalodimas@linaro.org> Link: https://patch.msgid.link/20241109023303.3366500-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-11-12 18:26:58 -08:00
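The 33.3% figure quoted in the page_pool commit above can be reproduced from the sample output. What follows is a minimal Python sketch, not the YNL sample itself: it assumes the rate is the sum of the two recycle counters divided by the sum of the two alloc counters, and the function name is purely illustrative.

```python
def recycling_rate(alloc, recycle):
    """Return recycled pages as a fraction of allocated pages.

    Both arguments are pairs of counters as printed in the sample
    output; summing each pair is an assumption of this sketch.
    """
    return sum(recycle) / sum(alloc)


# Counters copied from the sample output in the commit message.
alloc = (4557, 2256365862)
recycle = (200476245, 551541893)
rate = recycling_rate(alloc, recycle)

# With 4k pages and 1200B frags, each page is handed out this many times.
frags_per_page = 4096 // 1200

print(f"recycling: {rate * 100:.1f}%")
print(f"frags per page: {frags_per_page}")
```

Because every recycled page is counted once on return but counted once per frag on allocation, near-perfect recycling shows up as roughly 1 / frags_per_page, which is the mismatch the commit removes.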
Jakub Kicinski	7ed816be35	eth: bnxt: use page pool for head frags Testing small size RPCs (300B-400B) on a large AMD system suggests that page pool recycling is very useful even for just the head frags. With this patch (and copy break disabled) I see a 30% performance improvement (82Gbps -> 106Gbps). Convert bnxt from normal page frags to page pool frags for head buffers. On systems with small page size we can use the same pool as for TPA pages. On systems with large pages the frag allocation logic of the page pool is already used to split a large page into TPA chunks. TPA chunks are much larger than heads (8k or 64k, AFAICT vs 1kB) and we always allocate the same sized chunks. Mixing allocation of TPA and head pages would lead to sub-optimal memory use. Plus Taehee's work on zero-copy / devmem will need to differentiate between TPA and non-TPA page pool, anyway. Conditionally allocate a new page pool for heads. Link: https://patch.msgid.link/20241109035119.3391864-1-kuba@kernel.org Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-11-12 18:26:38 -08:00
Alexandre Ferrieux	73af53d820	net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes. To generate hnode handles (in gen_new_htid()), u32 uses IDR and encodes the returned small integer into a structured 32-bit word. Unfortunately, at disposal time, the needed decoding is not done. As a result, idr_remove() fails, and the IDR fills up. Since its size is 2048, the following script ends up with "Filter already exists": tc filter add dev myve $FILTER1 tc filter add dev myve $FILTER2 for i in {1..2048} do echo $i tc filter del dev myve $FILTER2 tc filter add dev myve $FILTER2 done This patch adds the missing decoding logic for handles that deserve it. Fixes: e7614370d6f0 ("net_sched: use idr to allocate u32 filter handles") Reviewed-by: Eric Dumazet <edumazet@google.com> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Signed-off-by: Alexandre Ferrieux <alexandre.ferrieux@orange.com> Tested-by: Victor Nogueira <victor@mojatatu.com> Link: https://patch.msgid.link/20241110172836.331319-1-alexandre.ferrieux@orange.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-11-12 18:26:03 -08:00
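The handle mismatch described in the cls_u32 fix above can be illustrated with a small model. This is a sketch in Python, not the kernel code: the bit layout (the ID OR-ed with 0x800 and shifted left by 20) is an assumption about the encoding, a plain set stands in for the IDR, and all function names are hypothetical.

```python
ID_MASK = 0x7FF  # assumed width of the small integer ID


def encode_htid(idr_id):
    """Pack a small ID into a structured 32-bit handle (assumed layout)."""
    return ((idr_id | 0x800) << 20) & 0xFFFFFFFF


def decode_htid(handle):
    """Recover the small ID; handles without the top bit pass through."""
    if handle & 0x80000000:
        return (handle >> 20) & ID_MASK
    return handle


allocated = set()  # stand-in for the IDR


def idr_alloc():
    """Hand out the lowest free ID, or fail when the space is exhausted."""
    for candidate in range(1, ID_MASK + 1):
        if candidate not in allocated:
            allocated.add(candidate)
            return candidate
    raise RuntimeError("IDR full")


def idr_remove(key):
    """Remove a key; removing an unknown key silently does nothing."""
    allocated.discard(key)


handle = encode_htid(idr_alloc())

idr_remove(handle)               # removal with the encoded handle: no match
leaked = len(allocated)          # the entry is still allocated

idr_remove(decode_htid(handle))  # removal with the decoded ID
freed = len(allocated)           # the entry is gone
```

In this model, repeating the add/delete cycle without the decode step would eventually exhaust the ID space, which corresponds to the "Filter already exists" failure the commit message reports.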
Andrew Lunn	078e0d596f	dsa: qca8k: Use nested lock to avoid splat qca8k_phy_eth_command() is used to probe the child MDIO bus while the parent MDIO is locked. This causes a lockdep splat, reporting a possible deadlock. It is not an actual deadlock, because different locks are used. By making use of mutex_lock_nested() we can avoid this false positive. Signed-off-by: Andrew Lunn <andrew@lunn.ch> Link: https://patch.msgid.link/20241110175955.3053664-1-andrew@lunn.ch Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-11-12 18:25:30 -08:00
Geert Uytterhoeven	2b99b25325	MAINTAINERS: Re-add cancelled Renesas driver sections Removing full driver sections also removed mailing list entries, causing submitters of future patches to forget CCing these mailing lists. Hence re-add the sections for the Renesas Ethernet AVB, R-Car SATA, and SuperH Ethernet drivers. Add people who volunteered to maintain these drivers (thanks a lot!), and mark all of them as supported. Fixes: 6e90b675cf942e50 ("MAINTAINERS: Remove some entries due to various compliance requirements.") Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org> Reviewed-by: Simon Horman <horms@kernel.org> Acked-by: Niklas Cassel <cassel@kernel.org> Acked-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se> Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com> Link: https://patch.msgid.link/4b2105332edca277f07ffa195796975e9ddce994.1731319098.git.geert+renesas@glider.be Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-11-12 18:23:53 -08:00
Dmitry Kandybka	b169e76eba	mptcp: fix possible integer overflow in mptcp_reset_tout_timer In 'mptcp_reset_tout_timer', promote 'probe_timestamp' to unsigned long to avoid possible integer overflow. Compile tested only. Found by Linux Verification Center (linuxtesting.org) with SVACE. Signed-off-by: Dmitry Kandybka <d.kandybka@gmail.com> Link: https://patch.msgid.link/20241107103657.1560536-1-d.kandybka@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-11-12 18:11:23 -08:00
Wander Lairson Costa	50d325bb05	Revert "igb: Disable threaded IRQ for igb_msix_other" This reverts commit 338c4d3902feb5be49bfda530a72c7ab860e2c9f. Sebastian noticed the ISR indirectly acquires spin_locks, which are sleeping locks under PREEMPT_RT, which leads to kernel splats. Fixes: 338c4d3902feb ("igb: Disable threaded IRQ for igb_msix_other") Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Signed-off-by: Wander Lairson Costa <wander@redhat.com> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Acked-by: Przemek Kitszel <przemyslaw.kitszel@intel.com> Link: https://patch.msgid.link/20241106111427.7272-1-wander@redhat.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-11-12 18:05:40 -08:00
Jakub Kicinski	e707e366f3	bluetooth pull request for net: - btintel: Direct exception event to bluetooth stack - hci_core: Fix calling mgmt_device_connected Merge tag 'for-net-2024-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Luiz Augusto von Dentz says: ==================== bluetooth pull request for net: - btintel: Direct exception event to bluetooth stack - hci_core: Fix calling mgmt_device_connected * tag 'for-net-2024-11-12' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth: Bluetooth: btintel: Direct exception event to bluetooth stack Bluetooth: hci_core: Fix calling mgmt_device_connected ==================== Link: https://patch.msgid.link/20241112175326.930800-1-luiz.dentz@gmail.com Signed-off-by: Jakub Kicinski <kuba@kernel.org>	2024-11-12 17:30:42 -08:00
Linus Torvalds	f1b785f4c7	virtio: bugfix A last minute mlx5 bugfix Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost Pull virtio fix from Michael Tsirkin: "A last minute mlx5 bugfix" * tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost: vdpa/mlx5: Fix PA offset with unaligned starting iotlb map	2024-11-12 16:39:34 -08:00
Johannes Weiner	dcf32ea7ec	mm: swapfile: fix cluster reclaim work crash on rotational devices syzbot and Daan report a NULL pointer crash in the new full swap cluster reclaim work: > Oops: general protection fault, probably for non-canonical address 0xdffffc0000000001: 0000 [#1] PREEMPT SMP KASAN PTI > KASAN: null-ptr-deref in range [0x0000000000000008-0x000000000000000f] > CPU: 1 UID: 0 PID: 51 Comm: kworker/1:1 Not tainted 6.12.0-rc6-syzkaller #0 > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024 > Workqueue: events swap_reclaim_work > RIP: 0010:__list_del_entry_valid_or_report+0x20/0x1c0 lib/list_debug.c:49 > Code: 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 48 89 fe 48 83 c7 08 48 83 ec 18 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <80> 3c 02 00 0f 85 19 01 00 00 48 89 f2 48 8b 4e 08 48 b8 00 00 00 > RSP: 0018:ffffc90000bb7c30 EFLAGS: 00010202 > RAX: dffffc0000000000 RBX: 0000000000000000 RCX: ffff88807b9ae078 > RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000008 > RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000 > R10: 0000000000000001 R11: 000000000000004f R12: dffffc0000000000 > R13: ffffffffffffffb8 R14: ffff88807b9ae000 R15: ffffc90003af1000 > FS: 0000000000000000(0000) GS:ffff8880b8700000(0000) knlGS:0000000000000000 > CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > CR2: 00007fffaca68fb8 CR3: 00000000791c8000 CR4: 00000000003526f0 > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 > Call Trace: > <TASK> > __list_del_entry_valid include/linux/list.h:124 [inline] > __list_del_entry include/linux/list.h:215 [inline] > list_move_tail include/linux/list.h:310 [inline] > swap_reclaim_full_clusters+0x109/0x460 mm/swapfile.c:748 > swap_reclaim_work+0x2e/0x40 mm/swapfile.c:779 The syzbot console output indicates a virtual environment where swapfile is on a rotational device. 
In this case, clusters aren't actually used, and si->full_clusters is not initialized. Daan's report is from qemu, so likely rotational too. Make sure to only schedule the cluster reclaim work when clusters are actually in use. Link: https://lkml.kernel.org/r/20241107142335.GB1172372@cmpxchg.org Link: https://lore.kernel.org/lkml/672ac50b.050a0220.2edce.1517.GAE@google.com/ Link: https://github.com/systemd/systemd/issues/35044 Fixes: 5168a68eb78f ("mm, swap: avoid over reclaim of full clusters") Reported-by: syzbot+078be8bfa863cb9e0c6b@syzkaller.appspotmail.com Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Reported-by: Daan De Meyer <daan.j.demeyer@gmail.com> Cc: Kairui Song <ryncsn@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-11-12 16:01:36 -08:00
Si-Wei Liu	29ce8b8a4f	vdpa/mlx5: Fix PA offset with unaligned starting iotlb map When calculating the physical address range based on the iotlb and mr [start,end) ranges, the offset of mr->start relative to map->start is not taken into account. This leads to some incorrect and duplicate mappings. For the case when mr->start < map->start the code is already correct: the range in [mr->start, map->start) was handled by a different iteration. Fixes: 94abbccdf291 ("vdpa/mlx5: Add shared memory registration code") Cc: stable@vger.kernel.org Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com> Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com> Message-Id: <20241021134040.975221-2-dtatulea@nvidia.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com>	2024-11-12 18:05:04 -05:00
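The offset issue described in the vdpa/mlx5 fix above comes down to interval arithmetic. The Python sketch below is not the driver code: the function and parameter names are hypothetical, and it only assumes what the commit message states, namely that both ranges are half-open [start, end) and that the physical address must be advanced by the distance between mr->start and map->start when the mr begins inside the map.

```python
def overlap_pa_range(map_start, map_end, map_pa, mr_start, mr_end):
    """Return the physical [start, end) range backing the overlap of an
    iotlb map and a memory region, or None if they do not overlap.

    map_pa is the physical address that corresponds to map_start.
    """
    start = max(map_start, mr_start)
    end = min(map_end, mr_end)
    if start >= end:
        return None
    # Offset of the overlap relative to the start of the map. This is
    # zero when mr_start <= map_start, and positive otherwise.
    offset = start - map_start
    pa_start = map_pa + offset
    return pa_start, pa_start + (end - start)


# mr starts inside the map: the physical range must be shifted by 0x1000.
inside = overlap_pa_range(0x1000, 0x5000, 0x80000, 0x2000, 0x4000)

# mr starts before the map: no shift is needed for this map.
before = overlap_pa_range(0x1000, 0x5000, 0x80000, 0x0, 0x3000)
```

Dropping the `offset` term would map the first case to the physical range starting at 0x80000, which belongs to a different part of the map; that is the kind of incorrect or duplicate mapping the commit message refers to.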
Linus Torvalds	14b6320953	KVM x86 and selftests fixes for 6.12: x86: - When emulating a guest TLB flush for a nested guest, flush vpid01, not vpid02, if L2 is active but VPID is disabled in vmcs12, i.e. if L2 and L1 are sharing VPID '0' (from L1's perspective). - Fix a bug in the SNP initialization flow where KVM would return '0' to userspace instead of -errno on failure. - Move the Intel PT virtualization (i.e. outputting host trace to host buffer and guest trace to guest buffer) behind CONFIG_BROKEN. - Fix memory leak on failure of KVM_SEV_SNP_LAUNCH_START - Fix a bug where KVM fails to inject an interrupt from the IRR after KVM_SET_LAPIC. Selftests: - Increase the timeout for the memslot performance selftest to avoid false failures on arm64 and nested x86 platforms. - Fix a goof in the guest_memfd selftest where a for-loop initialized a bit mask to zero instead of BIT(0). - Disable strict aliasing when building KVM selftests to prevent the compiler from treating things like "u64 *" to "uint64_t *" cases as undefined behavior, which can lead to nasty, hard to debug failures. - Force -march=x86-64-v2 for KVM x86 selftests if and only if the uarch is supported by the compiler. - Fix broken compilation of kvm selftests after a header sync in tools/ Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm Pull kvm fixes from Paolo Bonzini: "x86 and selftests fixes. 
x86: - When emulating a guest TLB flush for a nested guest, flush vpid01, not vpid02, if L2 is active but VPID is disabled in vmcs12, i.e. if L2 and L1 are sharing VPID '0' (from L1's perspective). - Fix a bug in the SNP initialization flow where KVM would return '0' to userspace instead of -errno on failure. - Move the Intel PT virtualization (i.e. outputting host trace to host buffer and guest trace to guest buffer) behind CONFIG_BROKEN. - Fix memory leak on failure of KVM_SEV_SNP_LAUNCH_START - Fix a bug where KVM fails to inject an interrupt from the IRR after KVM_SET_LAPIC. Selftests: - Increase the timeout for the memslot performance selftest to avoid false failures on arm64 and nested x86 platforms. - Fix a goof in the guest_memfd selftest where a for-loop initialized a bit mask to zero instead of BIT(0). - Disable strict aliasing when building KVM selftests to prevent the compiler from treating things like "u64 *" to "uint64_t *" cases as undefined behavior, which can lead to nasty, hard to debug failures. - Force -march=x86-64-v2 for KVM x86 selftests if and only if the uarch is supported by the compiler. - Fix broken compilation of kvm selftests after a header sync in tools/" * tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: KVM: VMX: Bury Intel PT virtualization (guest/host mode) behind CONFIG_BROKEN KVM: x86: Unconditionally set irr_pending when updating APICv state kvm: svm: Fix gctx page leak on invalid inputs KVM: selftests: use X86_MEMTYPE_WB instead of VMX_BASIC_MEM_TYPE_WB KVM: SVM: Propagate error from snp_guest_req_init() to userspace KVM: nVMX: Treat vpid01 as current if L2 is active, but with VPID disabled KVM: selftests: Don't force -march=x86-64-v2 if it's unsupported KVM: selftests: Disable strict aliasing KVM: selftests: fix unintentional noop test in guest_memfd_test.c KVM: selftests: memslot_perf_test: increase guest sync timeout	2024-11-12 13:35:13 -08:00
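The guest_memfd selftest item above, a for-loop that initialized a bit mask to zero instead of BIT(0), describes a loop that never executes its body. The Python sketch below mirrors the shape of a C loop such as `for (flag = first; flag; flag <<= 1)`; that exact loop shape is an assumption, since the pull message only names the wrong initial value, and the helper names are hypothetical.

```python
def BIT(n):
    """Return an integer with only bit n set."""
    return 1 << n


def walk_flags(first, limit_bits=8):
    """Visit single-bit flags by shifting left, starting from `first`.

    The loop ends when the flag becomes zero or leaves the low
    `limit_bits` bits, standing in for a fixed-width C integer.
    """
    visited = []
    flag = first
    while flag and flag < BIT(limit_bits):
        visited.append(flag)
        flag <<= 1
    return visited


noop = walk_flags(0)                      # zero stays zero: body never runs
fixed = walk_flags(BIT(0), limit_bits=4)  # starts at bit 0 and walks upward
```

Starting from zero, the condition is false on the first check, so a test built on such a loop passes without testing anything, which matches the "unintentional noop test" wording in the shortlog.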
Linus Torvalds	5456ec9dab	- fix warnings about duplicate slab cache names Merge tag 'for-6.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm Pull device mapper fixes from Mikulas Patocka: - fix warnings about duplicate slab cache names * tag 'for-6.12/dm-fixes-3' of git://git.kernel.org/pub/scm/linux/kernel/git/device-mapper/linux-dm: dm-cache: fix warnings about duplicate slab caches dm-bufio: fix warnings about duplicate slab caches	2024-11-12 13:21:07 -08:00
Linus Torvalds	93db202ce0	integrity-v6.12 Merge tag 'integrity-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity Pull integrity fixes from Mimi Zohar: "One bug fix, one performance improvement, and the use of static_assert: - The bug fix addresses "only a cosmetic change" commit, which didn't take into account the original 'ima' template definition. - The performance improvement limits the atomic_read()" * tag 'integrity-v6.12' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity: integrity: Use static_assert() to check struct sizes evm: stop avoidably reading i_writecount in evm_file_release ima: fix buffer overrun in ima_eventdigest_init_common	2024-11-12 13:06:31 -08:00
Linus Torvalds	92dda329e3	Landlock fix for v6.12-rc7 Merge tag 'landlock-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux Pull landlock fixes from Mickaël Salaün: "This fixes issues in the Landlock's sandboxer sample and documentation, slightly refactors helpers (required for ongoing patch series), and improve/fix a feature merged in v6.12 (signal and abstract UNIX socket scoping)" * tag 'landlock-6.12-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/mic/linux: landlock: Optimize scope enforcement landlock: Refactor network access mask management landlock: Refactor filesystem access mask management samples/landlock: Clarify option parsing behaviour samples/landlock: Refactor help message samples/landlock: Fix port parsing in sandboxer landlock: Fix grammar issues in documentation landlock: Improve documentation of previous limitations	2024-11-12 13:01:09 -08:00
Kalle Valo	11597043d7	Revert "wifi: iwlegacy: do not skip frames with bad FCS" This reverts commit 02b682d54598f61cbb7dbb14d98ec1801112b878. Alf reports that this commit causes the connection to eventually die on iwl4965. The reason is that rx_status.flag is zeroed after RX_FLAG_FAILED_FCS_CRC is set and mac80211 doesn't know the received frame is corrupted. Fixes: 02b682d54598 ("wifi: iwlegacy: do not skip frames with bad FCS") Reported-by: Alf Marius <post@alfmarius.net> Closes: https://lore.kernel.org/r/60f752e8-787e-44a8-92ae-48bdfc9b43e7@app.fastmail.com/ Signed-off-by: Kalle Valo <kvalo@kernel.org> Link: https://patch.msgid.link/20241112142419.1023743-1-kvalo@kernel.org	2024-11-12 20:24:45 +02:00
Donet Tom	fae1980347	selftests: hugetlb_dio: fixup check for initial conditions to skip in the start This test verifies that a hugepage, used as a user buffer for DIO operations, is correctly freed upon unmapping. To test this, we read the count of free hugepages before and after the mmap, DIO, and munmap operations, then check if the free hugepage count is the same. Reading free hugepages before the test was removed by commit 0268d4579901 ('selftests: hugetlb_dio: check for initial conditions to skip at the start'), causing the test to always fail. This patch adds back reading the free hugepages before starting the test. With this patch, the tests are now passing. Test results without this patch: ./tools/testing/selftests/mm/hugetlb_dio TAP version 13 1..4 # No. Free pages before allocation : 0 # No. Free pages after munmap : 100 not ok 1 : Huge pages not freed! # No. Free pages before allocation : 0 # No. Free pages after munmap : 100 not ok 2 : Huge pages not freed! # No. Free pages before allocation : 0 # No. Free pages after munmap : 100 not ok 3 : Huge pages not freed! # No. Free pages before allocation : 0 # No. Free pages after munmap : 100 not ok 4 : Huge pages not freed! # Totals: pass:0 fail:4 xfail:0 xpass:0 skip:0 error:0 Test results with this patch: /tools/testing/selftests/mm/hugetlb_dio TAP version 13 1..4 # No. Free pages before allocation : 100 # No. Free pages after munmap : 100 ok 1 : Huge pages freed successfully ! # No. Free pages before allocation : 100 # No. Free pages after munmap : 100 ok 2 : Huge pages freed successfully ! # No. Free pages before allocation : 100 # No. Free pages after munmap : 100 ok 3 : Huge pages freed successfully ! # No. Free pages before allocation : 100 # No. Free pages after munmap : 100 ok 4 : Huge pages freed successfully ! 
# Totals: pass:4 fail:0 xfail:0 xpass:0 skip:0 error:0 Link: https://lkml.kernel.org/r/20241110064903.23626-1-donettom@linux.ibm.com Fixes: 0268d4579901 ("selftests: hugetlb_dio: check for initial conditions to skip in the start") Signed-off-by: Donet Tom <donettom@linux.ibm.com> Cc: Muhammad Usama Anjum <usama.anjum@collabora.com> Cc: Shuah Khan <shuah@kernel.org> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-11-12 10:14:00 -08:00
Hugh Dickins	a3477c9e02	mm/thp: fix deferred split queue not partially_mapped: fix Though even more elusive than before, list_del corruption has still been seen on THP's deferred split queue. The idea in commit e66f3185fa04 was right, but its implementation wrong. The context omitted an important comment just before the critical test: "split_folio() removes folio from list on success." In ignoring that comment, when a THP split succeeded, the code went on to release the preceding safe folio, preserving instead an irrelevant (formerly head) folio: which gives no safety because it's not on the list. Fix the logic. Link: https://lkml.kernel.org/r/3c995a30-31ce-0998-1b9f-3a2cb9354c91@google.com Fixes: e66f3185fa04 ("mm/thp: fix deferred split queue not partially_mapped") Signed-off-by: Hugh Dickins <hughd@google.com> Acked-by: Usama Arif <usamaarif642@gmail.com> Reviewed-by: Zi Yan <ziy@nvidia.com> Cc: Baolin Wang <baolin.wang@linux.alibaba.com> Cc: Barry Song <baohua@kernel.org> Cc: Chris Li <chrisl@kernel.org> Cc: David Hildenbrand <david@redhat.com> Cc: Johannes Weiner <hannes@cmpxchg.org> Cc: Kefeng Wang <wangkefeng.wang@huawei.com> Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Nhat Pham <nphamcs@gmail.com> Cc: Ryan Roberts <ryan.roberts@arm.com> Cc: Shakeel Butt <shakeel.butt@linux.dev> Cc: Wei Yang <richard.weiyang@gmail.com> Cc: Yang Shi <shy828301@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-11-12 10:14:00 -08:00
John Hubbard	94efde1d15	mm/gup: avoid an unnecessary allocation call for FOLL_LONGTERM cases commit 53ba78de064b ("mm/gup: introduce check_and_migrate_movable_folios()") created a new constraint on the pin_user_pages() API family: a potentially large internal allocation must now occur, for FOLL_LONGTERM cases. A user-visible consequence has now appeared: user space can no longer pin more than 2GB of memory anymore on x86_64. That's because, on a 4KB PAGE_SIZE system, when user space tries to (indirectly, via a device driver that calls pin_user_pages()) pin 2GB, this requires an allocation of a folio pointers array of MAX_PAGE_ORDER size, which is the limit for kmalloc(). In addition to the directly visible effect described above, there is also the problem of adding an unnecessary allocation. The pages array argument has already been allocated, and there is no need for a redundant folios array allocation in this case. Fix this by avoiding the new allocation entirely. This is done by referring to either the original page[i] within *pages, or to the associated folio. Thanks to David Hildenbrand for suggesting this approach and for providing the initial implementation (which I've tested and adjusted slightly) as well. 
[jhubbard@nvidia.com: whitespace tweak, per David] Link: https://lkml.kernel.org/r/131cf9c8-ebc0-4cbb-b722-22fa8527bf3c@nvidia.com [jhubbard@nvidia.com: bypass pofs_get_folio(), per Oscar] Link: https://lkml.kernel.org/r/c1587c7f-9155-45be-bd62-1e36c0dd6923@nvidia.com Link: https://lkml.kernel.org/r/20241105032944.141488-2-jhubbard@nvidia.com Fixes: 53ba78de064b ("mm/gup: introduce check_and_migrate_movable_folios()") Signed-off-by: John Hubbard <jhubbard@nvidia.com> Suggested-by: David Hildenbrand <david@redhat.com> Acked-by: David Hildenbrand <david@redhat.com> Reviewed-by: Oscar Salvador <osalvador@suse.de> Cc: Vivek Kasireddy <vivek.kasireddy@intel.com> Cc: Dave Airlie <airlied@redhat.com> Cc: Gerd Hoffmann <kraxel@redhat.com> Cc: Matthew Wilcox <willy@infradead.org> Cc: Christoph Hellwig <hch@infradead.org> Cc: Jason Gunthorpe <jgg@nvidia.com> Cc: Peter Xu <peterx@redhat.com> Cc: Arnd Bergmann <arnd@arndb.de> Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Dongwon Kim <dongwon.kim@intel.com> Cc: Hugh Dickins <hughd@google.com> Cc: Junxiao Chang <junxiao.chang@intel.com> Cc: <stable@vger.kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>	2024-11-12 10:14:00 -08:00
Kiran K	d5359a7f58	Bluetooth: btintel: Direct exception event to bluetooth stack Have exception event part of HCI traces which helps for debug. snoop traces: > HCI Event: Vendor (0xff) plen 79 Vendor Prefix (0x8780) Intel Extended Telemetry (0x03) Unknown extended telemetry event type (0xde) 01 01 de Unknown extended subevent 0x07 01 01 de 07 01 de 06 1c ef be ad de ef be ad de ef be ad de ef be ad de ef be ad de ef be ad de ef be ad de 05 14 ef be ad de ef be ad de ef be ad de ef be ad de ef be ad de 43 10 ef be ad de ef be ad de ef be ad de ef be ad de Fixes: af395330abed ("Bluetooth: btintel: Add Intel devcoredump support") Signed-off-by: Kiran K <kiran.k@intel.com> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-11-12 11:39:12 -05:00
Luiz Augusto von Dentz	7967dc8f79	Bluetooth: hci_core: Fix calling mgmt_device_connected Since 61a939c68ee0 ("Bluetooth: Queue incoming ACL data until BT_CONNECTED state is reached") there is no longer the need to call mgmt_device_connected as ACL data will be queued until BT_CONNECTED state. Link: https://bugzilla.kernel.org/show_bug.cgi?id=219458 Link: https://github.com/bluez/bluez/issues/1014 Fixes: 333b4fd11e89 ("Bluetooth: L2CAP: Fix uaf in l2cap_connect") Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>	2024-11-12 11:39:12 -05:00
Johannes Berg	f2aadc7212	wifi: mac80211: pass MBSSID config by reference It's inefficient and confusing to pass the MBSSID config by value, requiring the whole struct to be copied. Pass it by reference instead. Link: https://patch.msgid.link/20241108092227.48fbd8a00112.I64abc1296a7557aadf798d88db931024486ab3b6@changeid Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-11-12 13:46:55 +01:00
MeiChia Chiu	406c5548c6	wifi: mac80211: Support EHT 1024 aggregation size in TX Support EHT 1024 aggregation size in TX The 1024 agg size for RX is supported but not for TX. This patch adds this support and refactors common parsing logics for addbaext in both process_addba_resp and process_addba_req into a function. Reviewed-by: Shayne Chen <shayne.chen@mediatek.com> Reviewed-by: Money Wang <money.wang@mediatek.com> Co-developed-by: Peter Chiu <chui-hao.chiu@mediatek.com> Signed-off-by: Peter Chiu <chui-hao.chiu@mediatek.com> Signed-off-by: MeiChia Chiu <MeiChia.Chiu@mediatek.com> Link: https://patch.msgid.link/20241112083846.32063-1-MeiChia.Chiu@mediatek.com [pass elems/len instead of mgmt/len/is_req] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-11-12 13:41:45 +01:00
Mingwei Zheng	8251e7621b	net: rfkill: gpio: Add check for clk_enable() Add check for the return value of clk_enable() to catch the potential error. Fixes: 7176ba23f8b5 ("net: rfkill: add generic gpio rfkill driver") Signed-off-by: Mingwei Zheng <zmw12306@gmail.com> Signed-off-by: Jiasheng Jiang <jiashengjiangcool@gmail.com> Link: https://patch.msgid.link/20241108195341.1853080-1-zmw12306@gmail.com Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2024-11-12 13:30:31 +01:00
Jakub Kicinski	a58f00ed24	net: sched: cls_api: improve the error message for ID allocation failure We run into an exhaustion problem with the kernel-allocated filter IDs. Our allocation problem can be fixed on the user space side, but the error message in this case was quite misleading: "Filter with specified priority/protocol not found" (EINVAL) Specifically when we can't allocate a _new_ ID because filter with lowest ID already _exists_, saying "filter not found", is confusing. Kernel allocates IDs in range of 0xc0000 -> 0x8000, giving out ID one lower than lowest existing in that range. The error message makes sense when tcf_chain_tp_find() gets called for GET and DEL but for NEW we need to provide more specific error messages for all three cases: - user wants the ID to be auto-allocated but filter with ID 0x8000 already exists - filter already exists and can be replaced, but user asked for a protocol change - filter doesn't exist Caller of tcf_chain_tp_insert_unique() doesn't set extack today, so don't bother plumbing it in. Reviewed-by: Simon Horman <horms@kernel.org> Signed-off-by: Jakub Kicinski <kuba@kernel.org> Acked-by: Jamal Hadi Salim <jhs@mojatatu.com> Link: https://patch.msgid.link/20241108010254.2995438-1-kuba@kernel.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 12:58:31 +01:00
Paolo Abeni	20bbe5b802	Merge branch 'virtio-vsock-fix-memory-leaks' Michal Luczaj says: ==================== virtio/vsock: Fix memory leaks Short series fixing some memory leaks that I've stumbled upon while toying with the selftests. Signed-off-by: Michal Luczaj <mhal@rbox.co> ==================== Link: https://patch.msgid.link/20241107-vsock-mem-leaks-v2-0-4e21bfcfc818@rbox.co Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 12:16:55 +01:00
Michal Luczaj	60cf6206a1	virtio/vsock: Improve MSG_ZEROCOPY error handling Add a missing kfree_skb() to prevent memory leaks. Fixes: 581512a6dc93 ("vsock/virtio: MSG_ZEROCOPY flag support") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 12:16:51 +01:00
Michal Luczaj	fbf7085b3a	vsock: Fix sk_error_queue memory leak Kernel queues MSG_ZEROCOPY completion notifications on the error queue. Where they remain, until explicitly recv()ed. To prevent memory leaks, clean up the queue when the socket is destroyed. unreferenced object 0xffff8881028beb00 (size 224): comm "vsock_test", pid 1218, jiffies 4294694897 hex dump (first 32 bytes): 90 b0 21 17 81 88 ff ff 90 b0 21 17 81 88 ff ff ..!.......!..... 00 00 00 00 00 00 00 00 00 b0 21 17 81 88 ff ff ..........!..... backtrace (crc 6c7031ca): [<ffffffff81418ef7>] kmem_cache_alloc_node_noprof+0x2f7/0x370 [<ffffffff81d35882>] __alloc_skb+0x132/0x180 [<ffffffff81d2d32b>] sock_omalloc+0x4b/0x80 [<ffffffff81d3a8ae>] msg_zerocopy_realloc+0x9e/0x240 [<ffffffff81fe5cb2>] virtio_transport_send_pkt_info+0x412/0x4c0 [<ffffffff81fe6183>] virtio_transport_stream_enqueue+0x43/0x50 [<ffffffff81fe0813>] vsock_connectible_sendmsg+0x373/0x450 [<ffffffff81d233d5>] ____sys_sendmsg+0x365/0x3a0 [<ffffffff81d246f4>] ___sys_sendmsg+0x84/0xd0 [<ffffffff81d26f47>] __sys_sendmsg+0x47/0x80 [<ffffffff820d3df3>] do_syscall_64+0x93/0x180 [<ffffffff8220012b>] entry_SYSCALL_64_after_hwframe+0x76/0x7e Fixes: 581512a6dc93 ("vsock/virtio: MSG_ZEROCOPY flag support") Signed-off-by: Michal Luczaj <mhal@rbox.co> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Acked-by: Arseniy Krasnov <avkrasnov@salutedevices.com> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 12:16:51 +01:00
Michal Luczaj	d7b0ff5a86	virtio/vsock: Fix accept_queue memory leak As the final stages of socket destruction may be delayed, it is possible that virtio_transport_recv_listen() will be called after the accept_queue has been flushed, but before the SOCK_DONE flag has been set. As a result, sockets enqueued after the flush would remain unremoved, leading to a memory leak. vsock_release __vsock_release lock virtio_transport_release virtio_transport_close schedule_delayed_work(close_work) sk_shutdown = SHUTDOWN_MASK (!) flush accept_queue release virtio_transport_recv_pkt vsock_find_bound_socket lock if flag(SOCK_DONE) return virtio_transport_recv_listen child = vsock_create_connected (!) vsock_enqueue_accept(child) release close_work lock virtio_transport_do_close set_flag(SOCK_DONE) virtio_transport_remove_sock vsock_remove_sock vsock_remove_bound release Introduce a sk_shutdown check to disallow vsock_enqueue_accept() during socket destruction. unreferenced object 0xffff888109e3f800 (size 2040): comm "kworker/5:2", pid 371, jiffies 4294940105 hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 28 00 0b 40 00 00 00 00 00 00 00 00 00 00 00 00 (..@............ 
backtrace (crc 9e5f4e84): [<ffffffff81418ff1>] kmem_cache_alloc_noprof+0x2c1/0x360 [<ffffffff81d27aa0>] sk_prot_alloc+0x30/0x120 [<ffffffff81d2b54c>] sk_alloc+0x2c/0x4b0 [<ffffffff81fe049a>] __vsock_create.constprop.0+0x2a/0x310 [<ffffffff81fe6d6c>] virtio_transport_recv_pkt+0x4dc/0x9a0 [<ffffffff81fe745d>] vsock_loopback_work+0xfd/0x140 [<ffffffff810fc6ac>] process_one_work+0x20c/0x570 [<ffffffff810fce3f>] worker_thread+0x1bf/0x3a0 [<ffffffff811070dd>] kthread+0xdd/0x110 [<ffffffff81044fdd>] ret_from_fork+0x2d/0x50 [<ffffffff8100785a>] ret_from_fork_asm+0x1a/0x30 Fixes: 3fe356d58efa ("vsock/virtio: discard packets only when socket is really closed") Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Michal Luczaj <mhal@rbox.co> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 12:16:51 +01:00
Breno Leitao	12079a59ce	net: Implement fault injection forcing skb reallocation Introduce a fault injection mechanism to force skb reallocation. The primary goal is to catch bugs related to pointer invalidation after potential skb reallocation. The fault injection mechanism aims to identify scenarios where callers retain pointers to various headers in the skb but fail to reload these pointers after calling a function that may reallocate the data. This type of bug can lead to memory corruption or crashes if the old, now-invalid pointers are used. By forcing reallocation through fault injection, we can stress-test code paths and ensure proper pointer management after potential skb reallocations. Add a hook for fault injection in the following functions: * pskb_trim_rcsum() * pskb_may_pull_reason() * pskb_trim() As the other fault injection mechanism, protect it under a debug Kconfig called CONFIG_FAIL_SKB_REALLOC. This patch was heavily inspired by Jakub's proposal from: https://lore.kernel.org/all/20240719174140.47a868e6@kernel.org/ CC: Akinobu Mita <akinobu.mita@gmail.com> Suggested-by: Jakub Kicinski <kuba@kernel.org> Signed-off-by: Breno Leitao <leitao@debian.org> Reviewed-by: Akinobu Mita <akinobu.mita@gmail.com> Acked-by: Paolo Abeni <pabeni@redhat.com> Acked-by: Guillaume Nault <gnault@redhat.com> Link: https://patch.msgid.link/20241107-fault_v6-v6-1-1b82cb6ecacd@debian.org Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 12:05:33 +01:00
Paolo Abeni	12f077a728	Merge branch 'net-ip-add-drop-reasons-to-input-route' Menglong Dong says: ==================== net: ip: add drop reasons to input route In this series, we mainly add some skb drop reasons to the input path of ip routing, and we make the following functions return drop reasons: fib_validate_source() ip_route_input_mc() ip_mc_validate_source() ip_route_input_slow() ip_route_input_rcu() ip_route_input_noref() ip_route_input() ip_mkroute_input() __mkroute_input() ip_route_use_hint() And following new skb drop reasons are added: SKB_DROP_REASON_IP_LOCAL_SOURCE SKB_DROP_REASON_IP_INVALID_SOURCE SKB_DROP_REASON_IP_LOCALNET SKB_DROP_REASON_IP_INVALID_DEST Changes since v4: - in the 6th patch: remove the unneeded "else" in ip_expire() - in the 8th patch: delete the unneeded comment in __mkroute_input() - in the 9th patch: replace "return 0" with "return SKB_NOT_DROPPED_YET" in ip_route_use_hint() Changes since v3: - don't refactor fib_validate_source/__fib_validate_source, and introduce a wrapper for fib_validate_source() instead in the 1st patch. - some small adjustment in the 4-7 patches Changes since v2: - refactor fib_validate_source and __fib_validate_source to make fib_validate_source return drop reasons - add the 9th and 10th patches to make this series cover the input route code path Changes since v1: - make ip_route_input_noref/ip_route_input_rcu/ip_route_input_slow return drop reasons, instead of passing a local variable to their function arguments. ==================== Link: https://patch.msgid.link/20241107125601.1076814-1-dongml2@chinatelecom.cn Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 11:24:53 +01:00
Menglong Dong	479aed04e8	net: ip: make ip_route_use_hint() return drop reasons In this commit, we make ip_route_use_hint() return drop reasons. The drop reasons that we return are similar to what we do in ip_route_input_slow(), and no drop reasons are added in this commit. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 11:24:51 +01:00
Menglong Dong	d9340d1e02	net: ip: make ip_mkroute_input/__mkroute_input return drop reasons In this commit, we make ip_mkroute_input() and __mkroute_input() return drop reasons. The drop reason "SKB_DROP_REASON_ARP_PVLAN_DISABLE" is introduced for the case: the packet which is not IP is forwarded to the in_dev, and the proxy_arp_pvlan is not enabled. Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn> Signed-off-by: Paolo Abeni <pabeni@redhat.com>	2024-11-12 11:24:51 +01:00


1312734 Commits