o fixes for inode garbage collection shutdown racing with work queue
updates
o ensure inodegc workers run on the CPU they are supposed to
o disable counter scrubbing until we can exclusively freeze the
filesystem from the kernel
o Regression fixes for new allocation related bugs
o a couple of minor cleanups
-----BEGIN PGP SIGNATURE-----
iQJIBAABCgAyFiEEmJOoJ8GffZYWSjj/regpR/R1+h0FAmRcSIsUHGRhdmlkQGZy
b21vcmJpdC5jb20ACgkQregpR/R1+h2Y8xAAxtsTdOx71XtDuNyfBOiqzZgTCq6b
6LsckJIDQa1AXjUNq9G3zWcUcWBcRWcw+CWbkqjqQ9W47K/ijLuoKnjRsQ+5B4DU
TBUctVq+/Zk2lBlb6HKuKdzqDGnIFWGVKVd7u8KlowqnXuzUeQ0vFkT7ZHTepUKG
P+midgGNVT4+tykq7oH0H8WxoTyNPZhKiAUcZjneBgA60IAoQWHA2iUt+SKpbrkL
1HyK+/edVMTXiDXtyHfXmDaH9Pgy6NCpw3TNkPDhuL1UDpLhg/zgT39rFZGBsAUt
gaDM3wN5jBrot/mvJE3rH9bdZhkcf+NQKPx/1DDg3DL8plS/1/LUC4cImdolBJ3w
RNmgJv1lK+AlE4MUJ/bUDlEpHUmwAjnnsxBXwEvnYNfj+9V6/mDB+HqKiY7/XxVK
vF77s6z+CWvefdnZavJ4/72pVVJNkcDYCYmvh/donRP6vtnwZyzocFUeBeNMInV1
/s3WMrF9hwmJqAClKG7p1fnszWp658yFIuw/TXVs+NrjTtQgXwMpl2cEYYvUZEJN
Trq2p0xH/JSwcnOPSPJO6WHb8UPoqrM6lgGFaJVWJx1AWt1i1CFLf5eA5X+XisDV
AJKgpqlnDg02bBMQ0tMFGZUaNx/1S1mwtxcZsyEFTutpUNxqJKDaMohpxxrWb0WC
ppSqDvyJN4wtlFI=
=qok2
-----END PGP SIGNATURE-----
Merge tag 'xfs-6.4-rc1-fixes' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux
Pull xfs bug fixes from Dave Chinner:
"Largely minor bug fixes and cleanups, th emost important of which are
probably the fixes for regressions in the extent allocation code:
- fixes for inode garbage collection shutdown racing with work queue
updates
- ensure inodegc workers run on the CPU they are supposed to
- disable counter scrubbing until we can exclusively freeze the
filesystem from the kernel
- regression fixes for new allocation related bugs
- a couple of minor cleanups"
* tag 'xfs-6.4-rc1-fixes' of git://git.kernel.org/pub/scm/fs/xfs/xfs-linux:
xfs: fix xfs_inodegc_stop racing with mod_delayed_work
xfs: disable reaping in fscounters scrub
xfs: check that per-cpu inodegc workers actually run on that cpu
xfs: explicitly specify cpu when forcing inodegc delayed work to run immediately
xfs: fix negative array access in xfs_getbmap
xfs: don't allocate into the data fork for an unshare request
xfs: flush dirty data and drain directios before scrubbing cow fork
xfs: set bnobt/cntbt numrecs correctly when formatting new AGs
xfs: don't unconditionally null args->pag in xfs_bmap_btalloc_at_eof
Few fixes for Devicetree bindings and related docs, all for issues
introduced in v6.4-rc1 commits:
1. media/ov2685: fix number of possible data lanes, as old binding
explicitly mentioned one data lane. This fixes dt_binding_check
warnings like:
Documentation/devicetree/bindings/media/rockchip-isp1.example.dtb: camera@3c: port:endpoint:data-lanes: [[1]] is too short
From schema: Documentation/devicetree/bindings/media/i2c/ovti,ov2685.yaml
2. Maintainers: correct path of Apple PWM binding. This fixes
refcheckdocs warning.
3. PCI/fsl,imx6q: correct parsing of assigned-clocks and related
properties and make the clocks more specific per PCI device (host or
endpoint). This fixes dtschema limitation and dt_binding_check
warnings like:
Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-ep.example.dtb: pcie-ep@33800000: Unevaluated properties are not allowed
From schema: Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-ep.yaml
-----BEGIN PGP SIGNATURE-----
iQJEBAABCgAuFiEE3dJiKD0RGyM7briowTdm5oaLg9cFAmRcxQIQHGtyemtAa2Vy
bmVsLm9yZwAKCRDBN2bmhouD158qD/0bBKNr0Ydj9HsobdAsGnkIUIpGgl2TktxB
tvOgbb4ppD3I4HAbh+AobXsB3zILnC1nv9Vky7jRxty9CquDotzRsedBhfih00fK
2lU5eClF1tX2dFNU+yHeLmdQxgocxfGvpvO8GZWpL5f/sZWYsFkuBiOedrFeqbLZ
6Sj231lW71rJj884HoeA/vLEQJXf4peHTGIfw5xQuSGninaForendT7fsZh9Ij4d
MdNEnA3irxypDWbuHiEVKwv/Fbgp9RTomgi9+MfaecmcH1Z0zTbRn/AzGYiehIgw
IeJA25HkOaiHNARyczQI2PzSPuFslFVT8vc7CtAgaX13NcJ01gbQDWm00W8tP5GX
hih3ky50GImarwhSFizmv5pxsGXc9FaPAC5WMgXMADqFP2equzYngKlDcXBnKkiw
hOByMJs9aGe2/fksExeV4OS3oBDLo3KTzvJ5biWNkcCaHxvNQT2fNPOMYuDxew1S
qvFgcxc4b/DkH73yO7Jp0wOAbUwAfSxD6CX06AyKl64OOZ852mtS8R51G0+xHQcn
bJ1uWgf9+TSlSZAUhdmcbkLRNTk+zM6T11hWoNB+hIr5FN4/KxteEsE5l0YvLMzo
T7cTrrXBee1MuU33x6GbLybsewBZMOAnXm64sqpua6DyfuBY1kBUjzL9mTc6PaB/
C4C6fY1Bew==
=iuNU
-----END PGP SIGNATURE-----
Merge tag 'dt-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-dt
Pull devicetree binding fixes from Krzysztof Kozlowski:
"A few fixes for Devicetree bindings and related docs, all for issues
introduced in v6.4-rc1 commits:
- media/ov2685: fix number of possible data lanes, as old binding
explicitly mentioned one data lane. This fixes dt_binding_check
warnings like:
Documentation/devicetree/bindings/media/rockchip-isp1.example.dtb: camera@3c: port:endpoint:data-lanes: [[1]] is too short
From schema: Documentation/devicetree/bindings/media/i2c/ovti,ov2685.yaml
- PCI/fsl,imx6q: correct parsing of assigned-clocks and related
properties and make the clocks more specific per PCI device (host
or endpoint). This fixes dtschema limitation and dt_binding_check
warnings like:
Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-ep.example.dtb: pcie-ep@33800000: Unevaluated properties are not allowed
From schema: Documentation/devicetree/bindings/pci/fsl,imx6q-pcie-ep.yaml
- Maintainers: correct path of Apple PWM binding. This fixes
refcheckdocs warning"
* tag 'dt-fixes-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/krzk/linux-dt:
dt-bindings: PCI: fsl,imx6q: fix assigned-clocks warning
MAINTAINERS: adjust file entry for ARM/APPLE MACHINE SUPPORT
media: dt-bindings: ov2685: Correct data-lanes attribute
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmRbUnQACgkQ1V2XiooU
IOQNVw//eoCiid3J4TuNdzHaBHXUlZvln1n5Z6K5fIz5ytrY1T7oKC8Y9StkzzWR
29ToFLOJt1iRAcgxsghWRIwUzNwpuUdqgd9cUZMHMQxT0BJItp6FXUql2+1LkF/I
b/gnnb90zyE7lBS/VSRyOiqMiJlP+Som22d7Nn5k2KfTYEdXKwfzjsWAu3W3Sb0s
Lv/MA9DE42qcwiZubmFmDtOtAunPJFZm3HgkcAVeXoNkBDrSfkvxLIMYG6VfFNhQ
AkKMyzX293wpwVxfOuQfJr4QVlxAgOQUko+FqajoWMBtfA3yldZjJ8RC7c9Af9uI
ciOP11vHBCG84KrTabC5kdqOcvadreDiM/oIvk57ztQhCr3e+po+vIz6Cv4p9I30
m5GXfgbtMRl6hM2S5lrRc5fNRkYJHE4aNvesFTGaLpK3LogusH1E9mH5jdjnbU42
TwkIe250qJJelNn9ZxS5Jt0BgyogNfeiA9lOXmaQBpYmwIahrjDf0g8z2I85nPhF
PDukjSXsxi4uHwpSF5wFrlqkAPiEX4vC95uSUTVbbBgZrHhNJdGgu4FJSHRM4mVo
6Awxk2O2bvcDpXEfTBdD4EJF71bZ/aH2i8ddU1oupIl88O01TWTtkO4SYHqMX+tC
fPnpGzBN8KJvHgeFxO4p0v1oelv2RtiITv1gi7YJOnq/+Vd2yy0=
=HmfP
-----END PGP SIGNATURE-----
Merge tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter updates for net
The following patchset contains Netfilter fixes for net:
1) Fix UAF when releasing netnamespace, from Florian Westphal.
2) Fix possible BUG_ON when nf_conntrack is enabled with enable_hooks,
from Florian Westphal.
3) Fixes for nft_flowtable.sh selftest, from Boris Sukholitko.
4) Extend nft_flowtable.sh selftest to cover integration with
ingress/egress hooks, from Florian Westphal.
* tag 'nf-23-05-10' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
selftests: nft_flowtable.sh: check ingress/egress chain too
selftests: nft_flowtable.sh: monitor result file sizes
selftests: nft_flowtable.sh: wait for specific nc pids
selftests: nft_flowtable.sh: no need for ps -x option
selftests: nft_flowtable.sh: use /proc for pid checking
netfilter: conntrack: fix possible bug_on with enable_hooks=1
netfilter: nf_tables: always release netdev hooks from notifier
====================
Link: https://lore.kernel.org/r/20230510083313.152961-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima says:
====================
af_unix: Fix two data races reported by KCSAN.
KCSAN reported data races around these two fields for AF_UNIX sockets.
* sk->sk_receive_queue->qlen
* sk->sk_shutdown
Let's annotate them properly.
====================
Link: https://lore.kernel.org/r/20230510003456.42357-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It's been a few years since we've sorted this thing, and the end result
is that we've added MAINTAINERS entries in the wrong order, and a number
of entries have their fields in non-canonical order too.
So roll this boulder up the hill one more time by re-running
./scripts/parse-maintainers.pl --order
on it.
This file ends up being fairly painful for merge conflicts even
normally, since unlike almost all other kernel files it's one of those
"everybody touches the same thing", and re-ordering all entries is only
going to make that worse. But the alternative is to never do it at all,
and just let it all rot..
The rc2 week is likely the quietest and least painful time to do this.
Requested-by: Randy Dunlap <rdunlap@infradead.org>
Requested-by: Joe Perches <joe@perches.com> # "Please use --order"
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQEzBAABCAAdFiEEq1nRK9aeMoq1VSgcnJ2qBz9kQNkFAmRcFLAACgkQnJ2qBz9k
QNmSuQf+P+hFOHVpCb83S4TrNsfPC9M7Pv/M1Rw926p1gESNLez1ElAx2B35OIBZ
y4ijqrSHgXbCsC3PvbxR9YfBPJZwIhtmKwjPC+9KlhAFB536Bvt/nt34fjRCWJiS
nABnexegkirk7EQufAGVZQ1gtm1p0qSCA1B/l5AFl4KeLIin31U/08F2eE1OAo3K
TmxhXzlC+JRXjf/X/X8mzJ8nz/hY5039BDvs7EpVdOsZcLFRQcCUygUCrOdV94uI
1wyzMoGNaiIsZHpHPf3NtphUoE9IFpZQJGXMTR4ehBgFYkkOHYxsjW+wn8YOmaks
qzRRXb9w0WdgAXChDGu1xYXV6Wpjiw==
=kqSe
-----END PGP SIGNATURE-----
Merge tag 'fsnotify_for_v6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs
Pull inotify fix from Jan Kara:
"A fix for possibly reporting invalid watch descriptor with inotify
event"
* tag 'fsnotify_for_v6.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs:
inotify: Avoid reporting event with invalid wd
On corrupt gfs2 file systems the evict code can try to reference the
journal descriptor structure, jdesc, after it has been freed and set to
NULL. The sequence of events is:
init_journal()
...
fail_jindex:
gfs2_jindex_free(sdp); <------frees journals, sets jdesc = NULL
if (gfs2_holder_initialized(&ji_gh))
gfs2_glock_dq_uninit(&ji_gh);
fail:
iput(sdp->sd_jindex); <--references jdesc in evict_linked_inode
evict()
gfs2_evict_inode()
evict_linked_inode()
ret = gfs2_trans_begin(sdp, 0, sdp->sd_jdesc->jd_blocks);
<------references the now freed/zeroed sd_jdesc pointer.
The call to gfs2_trans_begin is done because the truncate_inode_pages
call can cause gfs2 events that require a transaction, such as removing
journaled data (jdata) blocks from the journal.
This patch fixes the problem by adding a check for sdp->sd_jdesc to
function gfs2_evict_inode. In theory, this should only happen to corrupt
gfs2 file systems, when gfs2 detects the problem, reports it, then tries
to evict all the system inodes it has read in up to that point.
Reported-by: Yang Lan <lanyang0908@gmail.com>
Signed-off-by: Bob Peterson <rpeterso@redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Highlights:
- thinkpad_acpi: Fix profile (performance/bal/low-power) regression on T490
- misc. other small fixes / hw-id additions
The following is an automated git shortlog grouped by driver:
hp-wmi:
- add micmute to hp_wmi_keymap struct
intel_scu_pcidrv:
- Add back PCI ID for Medfield
platform/mellanox:
- fix potential race in mlxbf-tmfifo driver
platform/x86/intel-uncore-freq:
- Return error on write frequency
thinkpad_acpi:
- Add profile force ability
- Fix platform profiles on T490
touchscreen_dmi:
- Add info for the Dexp Ursus KX210i
- Add upside-down quirk for GDIX1002 ts on the Juno Tablet
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEEuvA7XScYQRpenhd+kuxHeUQDJ9wFAmRbcQ4UHGhkZWdvZWRl
QHJlZGhhdC5jb20ACgkQkuxHeUQDJ9yFnQf9G83wo2B5SfjN7V7viYHEGB4InfQL
+MbhwP+Efe++BkxwAzVMcVLSuYOsbqcrRBCiQkBVCyAF14LVVH3ymnn/upoqz9l4
K4JUZorGIEPgl0SpGP+41j+P93XYfcfMYPRetZA8ErBpitSfHVf9oAG8vTgaemda
Qxo8DWs4R1WGdg8KZ7icZ7shXiyzIeGBnQB/5EzpBrv6v6EmiJsUZh5ZZASOidrg
IXTp8LgLhkFbIQBgHoOyHoM7puRI7MRbxu5YEJ+wKumFi4KmuTET5iDlZFpmqvoH
eAU/JUWIsQ3I1QwTKFSLFYkJZCZrIIJHew1Lvs9F+/x1c2vs1QjOG8RoEg==
=SEuW
-----END PGP SIGNATURE-----
Merge tag 'platform-drivers-x86-v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86
Pull x86 platform driver fixes from Hans de Goede:
"Nothing special to report just various small fixes:
- thinkpad_acpi: Fix profile (performance/bal/low-power) regression
on T490
- misc other small fixes / hw-id additions"
* tag 'platform-drivers-x86-v6.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/pdx86/platform-drivers-x86:
platform/mellanox: fix potential race in mlxbf-tmfifo driver
platform/x86: touchscreen_dmi: Add info for the Dexp Ursus KX210i
platform/x86: touchscreen_dmi: Add upside-down quirk for GDIX1002 ts on the Juno Tablet
platform/x86: thinkpad_acpi: Add profile force ability
platform/x86: thinkpad_acpi: Fix platform profiles on T490
platform/x86: hp-wmi: add micmute to hp_wmi_keymap struct
platform/x86/intel-uncore-freq: Return error on write frequency
platform/x86: intel_scu_pcidrv: Add back PCI ID for Medfield
Commit d4c3676507 ("net: mscc: ocelot: keep ocelot_stat_layout by reg
address, not offset") organized the stats counters for Ocelot chips, namely
the VSC7512 and VSC7514. A few of the counter offsets were incorrect, and
were caught by this warning:
WARNING: CPU: 0 PID: 24 at drivers/net/ethernet/mscc/ocelot_stats.c:909
ocelot_stats_init+0x1fc/0x2d8
reg 0x5000078 had address 0x220 but reg 0x5000079 has address 0x214,
bulking broken!
Fix these register offsets.
Fixes: d4c3676507 ("net: mscc: ocelot: keep ocelot_stat_layout by reg address, not offset")
Signed-off-by: Colin Foster <colin.foster@in-advantage.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Fix the chapter heading for "X.25 Device Driver Interface" so that it
does not contain a trailing '-' character, which makes Sphinx
omit this heading from the contents.
Reverse the order of the x25.rst and x25-iface.rst files in the index
so that the project introduction (x25.rst) comes first.
Fixes: 883780af72 ("docs: networking: convert x25-iface.txt to ReST")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Dumazet <edumazet@google.com>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Paolo Abeni <pabeni@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: linux-doc@vger.kernel.org
Cc: Martin Schiller <ms@dev.tdt.de>
Cc: linux-x25@vger.kernel.org
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Clearing the PBA bit from the driver is race prone and it may lead to
dropped interrupt events. This could potentially lead to the traffic
being completely halted.
Fixes: 5e8c5adf95 ("gve: DQO: Add core netdev features")
Signed-off-by: Ziwei Xiao <ziweixiao@google.com>
Signed-off-by: Bailey Forrest <bcf@google.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
In synopsys_xpcs_compat[], the DW_XPCS_2500BASEX entry was setting
the number of interfaces using the xpcs_2500basex_features array
rather than xpcs_2500basex_interfaces. This causes us to overflow
the array of interfaces. Fix this.
Fixes: f27abde304 ("net: pcs: add 2500BASEX support for Intel mGbE controller")
Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
__condition is evaluated twice in sk_wait_event() macro.
First invocation is lockless, and reads can race with writes,
as spotted by syzbot.
BUG: KCSAN: data-race in sk_stream_wait_connect / tcp_disconnect
write to 0xffff88812d83d6a0 of 4 bytes by task 9065 on cpu 1:
tcp_disconnect+0x2cd/0xdb0
inet_shutdown+0x19e/0x1f0 net/ipv4/af_inet.c:911
__sys_shutdown_sock net/socket.c:2343 [inline]
__sys_shutdown net/socket.c:2355 [inline]
__do_sys_shutdown net/socket.c:2363 [inline]
__se_sys_shutdown+0xf8/0x140 net/socket.c:2361
__x64_sys_shutdown+0x31/0x40 net/socket.c:2361
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
read to 0xffff88812d83d6a0 of 4 bytes by task 9040 on cpu 0:
sk_stream_wait_connect+0x1de/0x3a0 net/core/stream.c:75
tcp_sendmsg_locked+0x2e4/0x2120 net/ipv4/tcp.c:1266
tcp_sendmsg+0x30/0x50 net/ipv4/tcp.c:1484
inet6_sendmsg+0x63/0x80 net/ipv6/af_inet6.c:651
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg net/socket.c:747 [inline]
__sys_sendto+0x246/0x300 net/socket.c:2142
__do_sys_sendto net/socket.c:2154 [inline]
__se_sys_sendto net/socket.c:2150 [inline]
__x64_sys_sendto+0x78/0x90 net/socket.c:2150
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
value changed: 0x00000000 -> 0x00000068
Fixes: 1da177e4c3 ("Linux-2.6.12-rc2")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
do_recvmmsg() can write to sk->sk_err from multiple threads.
As said before, many other points reading or writing sk_err
need annotations.
Fixes: 34b88a68f2 ("net: Fix use after free in the recvmmsg exit path")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reported-by: syzbot <syzkaller@googlegroups.com>
Reviewed-by: Kuniyuki Iwashima <kuniyu@amazon.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Hangbin Liu says:
====================
bonding: fix send_peer_notif overflow
Bonding send_peer_notif was defined as u8. But the value is
num_peer_notif multiplied by peer_notif_delay, which is u8 * u32.
This would cause the send_peer_notif overflow.
Before the fix:
TEST: num_grat_arp (active-backup miimon num_grat_arp 10) [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 20) [ OK ]
4 garp packets sent on active slave eth1
TEST: num_grat_arp (active-backup miimon num_grat_arp 30) [FAIL]
24 garp packets sent on active slave eth1
TEST: num_grat_arp (active-backup miimon num_grat_arp 50) [FAIL]
After the fix:
TEST: num_grat_arp (active-backup miimon num_grat_arp 10) [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 20) [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 30) [ OK ]
TEST: num_grat_arp (active-backup miimon num_grat_arp 50) [ OK ]
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
When run the test in netns, it's not easy to get the tc stats via
tc_rule_handle_stats_get(). With the new netns parameter, we can get
stats from specific netns like
num=$(tc_rule_handle_stats_get "dev eth0 ingress" 101 ".packets" "-n ns")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bonding only supports setting peer_notif_delay with miimon set.
Fixes: 0307d589c4 ("bonding: add documentation for peer_notif_delay")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Bonding send_peer_notif was defined as u8. Since commit 07a4ddec3c
("bonding: add an option to specify a delay between peer notifications").
the bond->send_peer_notif will be num_peer_notif multiplied by
peer_notif_delay, which is u8 * u32. This would cause the send_peer_notif
overflow easily. e.g.
ip link add bond0 type bond mode 1 miimon 100 num_grat_arp 30 peer_notify_delay 1000
To fix the overflow, let's set the send_peer_notif to u32 and limit
peer_notif_delay to 300s.
Reported-by: Liang Li <liali@redhat.com>
Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2090053
Fixes: 07a4ddec3c ("bonding: add an option to specify a delay between peer notifications")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Check for NULL pointer to avoid kernel crashing in case of missing WO
firmware in case only a single WEDv2 device has been initialized, e.g. on
MT7981 which can connect just one wireless frontend.
Fixes: 86ce0d09e4 ("net: ethernet: mtk_eth_soc: use WO firmware for MT7981")
Signed-off-by: Daniel Golle <daniel@makrotopia.org>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure flowtable interacts correctly with ingress and egress
chains, i.e. those get handled before and after flow table respectively.
Adds three more tests:
1. repeat flowtable test, but with 'ip dscp set cs3' done in
inet forward chain.
Expect that some packets have been mangled (before flowtable offload
became effective) while some pass without mangling (after offload
succeeds).
2. repeat flowtable test, but with 'ip dscp set cs3' done in
veth0:ingress.
Expect that all packets pass with cs3 dscp field.
3. same as 2, but use veth1:egress. Expect the same outcome.
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When running nft_flowtable.sh in VM on a busy server we've found that
the time of the netcat file transfers vary wildly.
Therefore replace hardcoded 3 second sleep with the loop checking for
a change in the file sizes. Once no change in detected we test the results.
Nice side effect is that we shave 1 second sleep in the fast case
(hard-coded 3 second sleep vs two 1 second sleeps).
Acked-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Doing wait with no parameters may interfere with some of the tests
having their own background processes.
Although no such test is currently present, the cleanup is useful
to rely on the nft_flowtable.sh for local development (e.g. running
background tcpdump command during the tests).
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Some ps commands (e.g. busybox derived) have no -x option. For the
purposes of hash calculation of the list of processes this option is
inessential.
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Some ps commands (e.g. busybox derived) have no -p option. Use /proc for
pid existence check.
Signed-off-by: Boris Sukholitko <boris.sukholitko@broadcom.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
I received a bug report (no reproducer so far) where we trip over
712 rcu_read_lock();
713 ct_hook = rcu_dereference(nf_ct_hook);
714 BUG_ON(ct_hook == NULL); // here
In nf_conntrack_destroy().
First turn this BUG_ON into a WARN. I think it was triggered
via enable_hooks=1 flag.
When this flag is turned on, the conntrack hooks are registered
before nf_ct_hook pointer gets assigned.
This opens a short window where packets enter the conntrack machinery,
can have skb->_nfct set up and a subsequent kfree_skb might occur
before nf_ct_hook is set.
Call nf_conntrack_init_end() to set nf_ct_hook before we register the
pernet ops.
Fixes: ba3fbe6636 ("netfilter: nf_conntrack: provide modparam to always register conntrack hooks")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
This reverts "netfilter: nf_tables: skip netdev events generated on netns removal".
The problem is that when a veth device is released, the veth release
callback will also queue the peer netns device for removal.
Its possible that the peer netns is also slated for removal. In this
case, the device memory is already released before the pre_exit hook of
the peer netns runs:
BUG: KASAN: slab-use-after-free in nf_hook_entry_head+0x1b8/0x1d0
Read of size 8 at addr ffff88812c0124f0 by task kworker/u8:1/45
Workqueue: netns cleanup_net
Call Trace:
nf_hook_entry_head+0x1b8/0x1d0
__nf_unregister_net_hook+0x76/0x510
nft_netdev_unregister_hooks+0xa0/0x220
__nft_release_hook+0x184/0x490
nf_tables_pre_exit_net+0x12f/0x1b0
..
Order is:
1. First netns is released, veth_dellink() queues peer netns device
for removal
2. peer netns is queued for removal
3. peer netns device is released, unreg event is triggered
4. unreg event is ignored because netns is going down
5. pre_exit hook calls nft_netdev_unregister_hooks but device memory
might be free'd already.
Fixes: 68a3765c65 ("netfilter: nf_tables: skip netdev events generated on netns removal")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Since the driver works in the "legacy" addressing mode, we need to write
to the expansion register (0x17) with bits 11:8 set to 0xf to properly
select the expansion register passed as argument.
Fixes: f68d08c437 ("net: phy: bcm7xxx: Add EPHY entry for 72165")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20230508231749.1681169-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Initialize MAC_ONEUS_TIC_COUNTER register with correct value derived
from CSR clock, otherwise EEE is unstable on at least NXP i.MX8M Plus
and Micrel KSZ9131RNX PHY, to the point where not even ARP request can
be sent out.
i.MX 8M Plus Applications Processor Reference Manual, Rev. 1, 06/2021
11.7.6.1.34 One-microsecond Reference Timer (MAC_ONEUS_TIC_COUNTER)
defines this register as:
"
This register controls the generation of the Reference time (1 microsecond
tic) for all the LPI timers. This timer has to be programmed by the software
initially.
...
The application must program this counter so that the number of clock cycles
of CSR clock is 1us. (Subtract 1 from the value before programming).
For example if the CSR clock is 100MHz then this field needs to be programmed
to value 100 - 1 = 99 (which is 0x63).
This is required to generate the 1US events that are used to update some of
the EEE related counters.
"
The reset value is 0x63 on i.MX8M Plus, which means expected CSR clock are
100 MHz. However, the i.MX8M Plus "enet_qos_root_clk" are 266 MHz instead,
which means the LPI timers reach their count much sooner on this platform.
This is visible using a scope by monitoring e.g. exit from LPI mode on TX_CTL
line from MAC to PHY. This should take 30us per STMMAC_DEFAULT_TWT_LS setting,
during which the TX_CTL line transitions from tristate to low, and 30 us later
from low to high. On i.MX8M Plus, this transition takes 11 us, which matches
the 30us * 100/266 formula for misconfigured MAC_ONEUS_TIC_COUNTER register.
Configure MAC_ONEUS_TIC_COUNTER based on CSR clock, so that the LPI timers
have correct 1us reference. This then fixes EEE on i.MX8M Plus with Micrel
KSZ9131RNX PHY.
Fixes: 477286b53f ("stmmac: add GMAC4 core support")
Signed-off-by: Marek Vasut <marex@denx.de>
Tested-by: Harald Seiler <hws@denx.de>
Reviewed-by: Francesco Dolcini <francesco.dolcini@toradex.com>
Tested-by: Francesco Dolcini <francesco.dolcini@toradex.com> # Toradex Verdin iMX8MP
Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com>
Link: https://lore.kernel.org/r/20230506235845.246105-1-marex@denx.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Dan has been improving on the smatch error pointer checks, and pointed
at another case where the __filemap_get_folio() conversion to error
pointers had been overlooked. This time because it was hidden behind
the filemap_grab_folio() helper function that is a wrapper around it.
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Cc: Anna Schumaker <anna@kernel.org>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEE8rQSAMVO+zA4DBdWxWXV+ddtWDsFAmRaYhYACgkQxWXV+ddt
WDvCRQ/+MjRuInALh+N34mMneF8jjPOlUQBZbaC43XYJ0ss9drSzvE3STmPVrjdK
IHyzRYipKI6vdTtzYbyGwxJ9oazsuXTQXC3w/qMW1hO1EAQ0a9tbnTSIQ+BDbU63
BW7rJ3JuM6hKxKK+e9Dserhks0lOgQc+xKT1CUELvAHp3UykD4OrNczguaIT2lGR
YXL+9B3ex2SooCqrQStkqEtjD/kxbaYUkK7yWA2FssXWqU5SjZwUOsuY3ZPOWrm1
ULNI67gIxkMkSynV3aYka7nY3xc9oGIfk9WPeylWcOcH3+pWabeptjk617XbA0KI
4biz1zZ/qTRXWlCLDv3ukUa5EIVAWQ1kxVE/hAt3SzqJvoqB/ymML/2LeQNdyx2i
adMTZQ95JkhQNU9Lp9QOtpgfZonhhjxnL9KE7eMVo28zJFdYjge3egINjimY+mLz
qzrzUBI3bqCNYG0LRR1EvuN0feBd/9nNMFjLBi2mkDqsWtzvTxxzWvVlV5EEcoJe
xrozGh00Y5ioP6ZanKuZRib+u2ligbD66dYhKSU74D6B5kuZPic3Kkn9qICjRByM
uBGBze/7GT/3ouhPOwxVPtGZstiFhbAxE7mApROrIxAx8I9rZjBdHgFJQklolXNy
HSKNf3u98XZBVVcku/O1hyoeTLnApPfApxD4lv3qlRmgdEnAp6I=
=K25T
-----END PGP SIGNATURE-----
Merge tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux
Pull btrfs fixes from David Sterba:
- fix backward leaf iteration which could possibly return the same key
- fix assertion when device add and balance race for exclusive
operation
- fix regression when freeing device, state tree would leak after
device replace
- fix attempt to clear space cache v1 when block-group-tree is enabled
- fix potential i_size corruption when encoded write races with send v2
and enabled no-holes (the race is hard to hit though, the window is a
few instructions wide)
- fix wrong bitmap API use when checking empty zones, parameters were
swapped but not causing a bug due to other code
- prevent potential qgroup leak if subvolume create does not commit
transaction (which is pending in the development queue)
- error handling and reporting:
- abort transaction when sibling keys check fails for leaves
- print extent buffers when sibling keys check fails
* tag 'for-6.4-rc1-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux:
btrfs: don't free qgroup space unless specified
btrfs: fix encoded write i_size corruption with no-holes
btrfs: zoned: fix wrong use of bitops API in btrfs_ensure_empty_zones
btrfs: properly reject clear_cache and v1 cache for block-group-tree
btrfs: print extent buffers when sibling keys check fails
btrfs: abort transaction when sibling keys check fails for leaves
btrfs: fix leak of source device allocation state after device replace
btrfs: fix assertion of exclop condition when starting balance
btrfs: fix btrfs_prev_leaf() to not return the same key twice
This commit adds memory barrier for the 'vq' update in function
mlxbf_tmfifo_virtio_find_vqs() to avoid potential race due to
out-of-order memory write. It also adds barrier for the 'is_ready'
flag to make sure the initializations are visible before this flag
is checked.
Signed-off-by: Liming Sun <limings@nvidia.com>
Reviewed-by: Vadim Pasternak <vadimp@nvidia.com>
Link: https://lore.kernel.org/r/b98c0ab61d644ba38fa9b3fd1607b138b0dd820b.1682518748.git.limings@nvidia.com
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Add return value for dim_calc_stats. This is an indication for the
caller if curr_stats was assigned by the function. Avoid using
curr_stats uninitialized over {rdma/net}_dim, when no time delta between
samples. Coverity reported this potential use of an uninitialized
variable.
Fixes: 4c4dbb4a73 ("net/mlx5e: Move dynamic interrupt coalescing code to include/linux")
Fixes: cb3c7fd4f8 ("net/mlx5e: Support adaptive RX coalescing")
Signed-off-by: Roy Novich <royno@nvidia.com>
Reviewed-by: Aya Levin <ayal@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
Reviewed-by: Michal Kubiak <michal.kubiak@intel.com>
Link: https://lore.kernel.org/r/20230507135743.138993-1-tariqt@nvidia.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
There has been a lot of confusion around which platform profiles are
supported on various platforms and it would be useful to have a debug
method to be able to override the profile mode that is selected.
I don't expect this to be used in anything other than debugging in
conjunction with Lenovo engineers - but it does give a way to get a
system working whilst we wait for either FW fixes, or a driver fix
to land upstream, if something is wonky in the mode detection logic
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20230505132523.214338-2-mpearson-lenovo@squebb.ca
Cc: stable@vger.kernel.org
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
I had incorrectly thought that PSC profiles were not usable on Intel
platforms so had blocked them in the driver initialistion. This broke
platform profiles on the T490.
After discussion with the FW team PSC does work on Intel platforms and
should be allowed.
Note - it's possible this may impact other platforms where it is advertised
but special driver support that only Windows has is needed. But if it does
then they will need fixing via quirks. Please report any issues to me so I
can get them addressed - but I haven't found any problems in testing...yet
Fixes: bce6243f76 ("platform/x86: thinkpad_acpi: do not use PSC mode on Intel platforms")
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2177962
Cc: stable@vger.kernel.org
Signed-off-by: Mark Pearson <mpearson-lenovo@squebb.ca>
Link: https://lore.kernel.org/r/20230505132523.214338-1-mpearson-lenovo@squebb.ca
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Currently when the uncore_write() returns error, it is silently
ignored. Return error to user space when uncore_write() fails.
Fixes: 49a474c7ba ("platform/x86: Add support for Uncore frequency control")
Signed-off-by: Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>
Reviewed-by: Zhang Rui <rui.zhang@intel.com>
Tested-by: Wendy Wang <wendy.wang@intel.com>
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Link: https://lore.kernel.org/r/20230418153230.679094-1-srinivas.pandruvada@linux.intel.com
Cc: stable@vger.kernel.org
Signed-off-by: Hans de Goede <hdegoede@redhat.com>