1240321 Commits

Author SHA1 Message Date
John Fastabend
dc9dfc8dc6 net: tls, fix WARNIING in __sk_msg_free
A splice with MSG_SPLICE_PAGES will cause tls code to use the
tls_sw_sendmsg_splice path in the TLS sendmsg code to move the user
provided pages from the msg into the msg_pl. This will loop over the
msg until msg_pl is full, checked by sk_msg_full(msg_pl). The user
can also set the MORE flag to hint stack to delay sending until receiving
more pages and ideally a full buffer.

If the user adds more pages to the msg than can fit in the msg_pl
scatterlist (MAX_MSG_FRAGS) we should ignore the MORE flag and send
the buffer anyways.

What actually happens though is we abort the msg to msg_pl scatterlist
setup and then because we forget to set 'full record' indicating we
can no longer consume data without a send we fallthrough to the 'continue'
path which will check if msg_data_left(msg) has more bytes to send and
then attempts to fit them in the already full msg_pl. Then next
iteration of sender doing send will encounter a full msg_pl and throw
the warning in the syzbot report.

To fix simply check if we have a full_record in splice code path and
if not send the msg regardless of MORE flag.

Reported-and-tested-by: syzbot+f2977222e0e95cec15c8@syzkaller.appspotmail.com
Reported-by: Edward Adam Davis <eadavis@qq.com>
Fixes: fe1e81d4f73b ("tls/sw: Support MSG_SPLICE_PAGES")
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: John Fastabend <john.fastabend@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-14 12:17:14 +00:00
Marc Kleine-Budde
894d750831 net: netdev_queue: netdev_txq_completed_mb(): fix wake condition
netif_txq_try_stop() uses "get_desc >= start_thrs" as the check for
the call to netif_tx_start_queue().

Use ">=" i netdev_txq_completed_mb(), too.

Fixes: c91c46de6bbc ("net: provide macros for commonly copied lockless queue stop/wake code")
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
Acked-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-13 18:26:23 +00:00
Eric Dumazet
9181d6f8a2 net: add more sanity check in virtio_net_hdr_to_skb()
syzbot/KMSAN reports access to uninitialized data from gso_features_check() [1]

The repro use af_packet, injecting a gso packet and hdrlen == 0.

We could fix the issue making gso_features_check() more careful
while dealing with NETIF_F_TSO_MANGLEID in fast path.

Or we can make sure virtio_net_hdr_to_skb() pulls minimal network and
transport headers as intended.

Note that for GSO packets coming from untrusted sources, SKB_GSO_DODGY
bit forces a proper header validation (and pull) before the packet can
hit any device ndo_start_xmit(), thus we do not need a precise disection
at virtio_net_hdr_to_skb() stage.

[1]
BUG: KMSAN: uninit-value in skb_gso_segment include/net/gso.h:83 [inline]
BUG: KMSAN: uninit-value in validate_xmit_skb+0x10f2/0x1930 net/core/dev.c:3629
 skb_gso_segment include/net/gso.h:83 [inline]
 validate_xmit_skb+0x10f2/0x1930 net/core/dev.c:3629
 __dev_queue_xmit+0x1eac/0x5130 net/core/dev.c:4341
 dev_queue_xmit include/linux/netdevice.h:3134 [inline]
 packet_xmit+0x9c/0x6b0 net/packet/af_packet.c:276
 packet_snd net/packet/af_packet.c:3087 [inline]
 packet_sendmsg+0x8b1d/0x9f30 net/packet/af_packet.c:3119
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg net/socket.c:745 [inline]
 ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584
 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
 __sys_sendmsg net/socket.c:2667 [inline]
 __do_sys_sendmsg net/socket.c:2676 [inline]
 __se_sys_sendmsg net/socket.c:2674 [inline]
 __x64_sys_sendmsg+0x307/0x490 net/socket.c:2674
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

Uninit was created at:
 slab_post_alloc_hook+0x129/0xa70 mm/slab.h:768
 slab_alloc_node mm/slub.c:3478 [inline]
 kmem_cache_alloc_node+0x5e9/0xb10 mm/slub.c:3523
 kmalloc_reserve+0x13d/0x4a0 net/core/skbuff.c:560
 __alloc_skb+0x318/0x740 net/core/skbuff.c:651
 alloc_skb include/linux/skbuff.h:1286 [inline]
 alloc_skb_with_frags+0xc8/0xbd0 net/core/skbuff.c:6334
 sock_alloc_send_pskb+0xa80/0xbf0 net/core/sock.c:2780
 packet_alloc_skb net/packet/af_packet.c:2936 [inline]
 packet_snd net/packet/af_packet.c:3030 [inline]
 packet_sendmsg+0x70e8/0x9f30 net/packet/af_packet.c:3119
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg net/socket.c:745 [inline]
 ____sys_sendmsg+0x9c2/0xd60 net/socket.c:2584
 ___sys_sendmsg+0x28d/0x3c0 net/socket.c:2638
 __sys_sendmsg net/socket.c:2667 [inline]
 __do_sys_sendmsg net/socket.c:2676 [inline]
 __se_sys_sendmsg net/socket.c:2674 [inline]
 __x64_sys_sendmsg+0x307/0x490 net/socket.c:2674
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

CPU: 0 PID: 5025 Comm: syz-executor279 Not tainted 6.7.0-rc7-syzkaller-00003-gfbafc3e621c3 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023

Reported-by: syzbot+7f4d0ea3df4d4fa9a65f@syzkaller.appspotmail.com
Link: https://lore.kernel.org/netdev/0000000000005abd7b060eb160cd@google.com/
Fixes: 9274124f023b ("net: stricter validation of untrusted gso packets")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Willem de Bruijn <willemb@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-13 18:06:23 +00:00
Jiri Pirko
e18405d0be net: sched: track device in tcf_block_get/put_ext() only for clsact binder types
Clsact/ingress qdisc is not the only one using shared block,
red is also using it. The device tracking was originally introduced
by commit 913b47d3424e ("net/sched: Introduce tc block netdev
tracking infra") for clsact/ingress only. Commit 94e2557d086a ("net:
sched: move block device tracking into tcf_block_get/put_ext()")
mistakenly enabled that for red as well.

Fix that by adding a check for the binder type being clsact when adding
device to the block->ports xarray.

Reported-by: Ido Schimmel <idosch@idosch.org>
Closes: https://lore.kernel.org/all/ZZ6JE0odnu1lLPtu@shredder/
Fixes: 94e2557d086a ("net: sched: move block device tracking into tcf_block_get/put_ext()")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Tested-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Tested-by: Victor Nogueira <victor@mojatatu.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-13 15:49:15 +00:00
Eric Dumazet
482521d8e0 udp: annotate data-races around up->pending
up->pending can be read without holding the socket lock,
as pointed out by syzbot [1]

Add READ_ONCE() in lockless contexts, and WRITE_ONCE()
on write side.

[1]
BUG: KCSAN: data-race in udpv6_sendmsg / udpv6_sendmsg

write to 0xffff88814e5eadf0 of 4 bytes by task 15547 on cpu 1:
 udpv6_sendmsg+0x1405/0x1530 net/ipv6/udp.c:1596
 inet6_sendmsg+0x63/0x80 net/ipv6/af_inet6.c:657
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg net/socket.c:745 [inline]
 __sys_sendto+0x257/0x310 net/socket.c:2192
 __do_sys_sendto net/socket.c:2204 [inline]
 __se_sys_sendto net/socket.c:2200 [inline]
 __x64_sys_sendto+0x78/0x90 net/socket.c:2200
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

read to 0xffff88814e5eadf0 of 4 bytes by task 15551 on cpu 0:
 udpv6_sendmsg+0x22c/0x1530 net/ipv6/udp.c:1373
 inet6_sendmsg+0x63/0x80 net/ipv6/af_inet6.c:657
 sock_sendmsg_nosec net/socket.c:730 [inline]
 __sock_sendmsg net/socket.c:745 [inline]
 ____sys_sendmsg+0x37c/0x4d0 net/socket.c:2586
 ___sys_sendmsg net/socket.c:2640 [inline]
 __sys_sendmmsg+0x269/0x500 net/socket.c:2726
 __do_sys_sendmmsg net/socket.c:2755 [inline]
 __se_sys_sendmmsg net/socket.c:2752 [inline]
 __x64_sys_sendmmsg+0x57/0x60 net/socket.c:2752
 do_syscall_x64 arch/x86/entry/common.c:52 [inline]
 do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

value changed: 0x00000000 -> 0x0000000a

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 15551 Comm: syz-executor.1 Tainted: G        W          6.7.0-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Reported-by: syzbot+8d482d0e407f665d9d10@syzkaller.appspotmail.com
Link: https://lore.kernel.org/netdev/0000000000009e46c3060ebcdffd@google.com/
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-13 15:46:20 +00:00
Sneh Shah
08300adac3 net: stmmac: Fix ethool link settings ops for integrated PCS
Currently get/set_link_ksettings ethtool ops are dependent on PCS.
When PCS is integrated, it will not have separate link config.
Bypass configuring and checking PCS for integrated PCS.

Fixes: aa571b6275fb ("net: stmmac: add new switch to struct plat_stmmacenet_data")
Tested-by: Andrew Halaney <ahalaney@redhat.com> # sa8775p-ride
Signed-off-by: Sneh Shah <quic_snehshah@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-13 12:41:50 +00:00
Jakub Kicinski
68d27badef Merge branch 'mptcp-better-validation-of-mptcpopt_mp_join-option'
Eric Dumazet says:

====================
mptcp: better validation of MPTCPOPT_MP_JOIN option

Based on a syzbot report (see 4th patch in the series).

We need to be more explicit about which one of the
following flag is set by mptcp_parse_option():

- OPTION_MPTCP_MPJ_SYN
- OPTION_MPTCP_MPJ_SYNACK
- OPTION_MPTCP_MPJ_ACK

Then select the appropriate values instead of OPTIONS_MPTCP_MPJ

Paolo suggested to do the same for OPTIONS_MPTCP_MPC (5th patch)
====================

Link: https://lore.kernel.org/r/20240111194917.4044654-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-12 18:14:24 -08:00
Eric Dumazet
724b00c129 mptcp: refine opt_mp_capable determination
OPTIONS_MPTCP_MPC is a combination of three flags.

It would be better to be strict about testing what
flag is expected, at least for code readability.

mptcp_parse_option() already makes the distinction.

- subflow_check_req() should use OPTION_MPTCP_MPC_SYN.

- mptcp_subflow_init_cookie_req() should use OPTION_MPTCP_MPC_ACK.

- subflow_finish_connect() should use OPTION_MPTCP_MPC_SYNACK

- subflow_syn_recv_sock should use OPTION_MPTCP_MPC_ACK

Suggested-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Fixes: 74c7dfbee3e1 ("mptcp: consolidate in_opt sub-options fields in a bitmask")
Link: https://lore.kernel.org/r/20240111194917.4044654-6-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-12 18:14:22 -08:00
Eric Dumazet
66ff70df1a mptcp: use OPTION_MPTCP_MPJ_SYN in subflow_check_req()
syzbot reported that subflow_check_req() was using uninitialized data in
subflow_check_req() [1]

This is because mp_opt.token is only set when OPTION_MPTCP_MPJ_SYN is also set.

While we are are it, fix mptcp_subflow_init_cookie_req()
to test for OPTION_MPTCP_MPJ_ACK.

[1]

BUG: KMSAN: uninit-value in subflow_token_join_request net/mptcp/subflow.c:91 [inline]
 BUG: KMSAN: uninit-value in subflow_check_req+0x1028/0x15d0 net/mptcp/subflow.c:209
  subflow_token_join_request net/mptcp/subflow.c:91 [inline]
  subflow_check_req+0x1028/0x15d0 net/mptcp/subflow.c:209
  subflow_v6_route_req+0x269/0x410 net/mptcp/subflow.c:367
  tcp_conn_request+0x153a/0x4240 net/ipv4/tcp_input.c:7164
 subflow_v6_conn_request+0x3ee/0x510
  tcp_rcv_state_process+0x2e1/0x4ac0 net/ipv4/tcp_input.c:6659
  tcp_v6_do_rcv+0x11bf/0x1fe0 net/ipv6/tcp_ipv6.c:1669
  tcp_v6_rcv+0x480b/0x4fb0 net/ipv6/tcp_ipv6.c:1900
  ip6_protocol_deliver_rcu+0xda6/0x2a60 net/ipv6/ip6_input.c:438
  ip6_input_finish net/ipv6/ip6_input.c:483 [inline]
  NF_HOOK include/linux/netfilter.h:314 [inline]
  ip6_input+0x15d/0x430 net/ipv6/ip6_input.c:492
  dst_input include/net/dst.h:461 [inline]
  ip6_rcv_finish+0x5db/0x870 net/ipv6/ip6_input.c:79
  NF_HOOK include/linux/netfilter.h:314 [inline]
  ipv6_rcv+0xda/0x390 net/ipv6/ip6_input.c:310
  __netif_receive_skb_one_core net/core/dev.c:5532 [inline]
  __netif_receive_skb+0x1a6/0x5a0 net/core/dev.c:5646
  netif_receive_skb_internal net/core/dev.c:5732 [inline]
  netif_receive_skb+0x58/0x660 net/core/dev.c:5791
  tun_rx_batched+0x3ee/0x980 drivers/net/tun.c:1555
  tun_get_user+0x53af/0x66d0 drivers/net/tun.c:2002
  tun_chr_write_iter+0x3af/0x5d0 drivers/net/tun.c:2048
  call_write_iter include/linux/fs.h:2020 [inline]
  new_sync_write fs/read_write.c:491 [inline]
  vfs_write+0x8ef/0x1490 fs/read_write.c:584
  ksys_write+0x20f/0x4c0 fs/read_write.c:637
  __do_sys_write fs/read_write.c:649 [inline]
  __se_sys_write fs/read_write.c:646 [inline]
  __x64_sys_write+0x93/0xd0 fs/read_write.c:646
  do_syscall_x64 arch/x86/entry/common.c:52 [inline]
  do_syscall_64+0x44/0x110 arch/x86/entry/common.c:83
 entry_SYSCALL_64_after_hwframe+0x63/0x6b

Local variable mp_opt created at:
  subflow_check_req+0x6d/0x15d0 net/mptcp/subflow.c:145
  subflow_v6_route_req+0x269/0x410 net/mptcp/subflow.c:367

CPU: 1 PID: 5924 Comm: syz-executor.3 Not tainted 6.7.0-rc8-syzkaller-00055-g5eff55d725a4 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 11/17/2023

Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Peter Krystad <peter.krystad@linux.intel.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Mat Martineau <martineau@kernel.org>
Cc: Geliang Tang <geliang.tang@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20240111194917.4044654-5-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-12 18:14:22 -08:00
Eric Dumazet
be1d9d9d38 mptcp: use OPTION_MPTCP_MPJ_SYNACK in subflow_finish_connect()
subflow_finish_connect() uses four fields (backup, join_id, thmac, none)
that may contain garbage unless OPTION_MPTCP_MPJ_SYNACK has been set
in mptcp_parse_option()

Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Peter Krystad <peter.krystad@linux.intel.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Mat Martineau <martineau@kernel.org>
Cc: Geliang Tang <geliang.tang@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20240111194917.4044654-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-12 18:14:21 -08:00
Eric Dumazet
c1665273bd mptcp: strict validation before using mp_opt->hmac
mp_opt->hmac contains uninitialized data unless OPTION_MPTCP_MPJ_ACK
was set in mptcp_parse_option().

We must refine the condition before we call subflow_hmac_valid().

Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Peter Krystad <peter.krystad@linux.intel.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Mat Martineau <martineau@kernel.org>
Cc: Geliang Tang <geliang.tang@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20240111194917.4044654-3-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-12 18:14:21 -08:00
Eric Dumazet
89e23277f9 mptcp: mptcp_parse_option() fix for MPTCPOPT_MP_JOIN
mptcp_parse_option() currently sets OPTIONS_MPTCP_MPJ, for the three
possible cases handled for MPTCPOPT_MP_JOIN option.

OPTIONS_MPTCP_MPJ is the combination of three flags:
- OPTION_MPTCP_MPJ_SYN
- OPTION_MPTCP_MPJ_SYNACK
- OPTION_MPTCP_MPJ_ACK

This is a problem, because backup, join_id, token, nonce and/or hmac fields
could be left uninitialized in some cases.

Distinguish the three cases, as following patches will need this step.

Fixes: f296234c98a8 ("mptcp: Add handling of incoming MP_JOIN requests")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Florian Westphal <fw@strlen.de>
Cc: Peter Krystad <peter.krystad@linux.intel.com>
Cc: Matthieu Baerts <matttbe@kernel.org>
Cc: Mat Martineau <martineau@kernel.org>
Cc: Geliang Tang <geliang.tang@linux.dev>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Link: https://lore.kernel.org/r/20240111194917.4044654-2-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-12 18:14:21 -08:00
Dmitry Antipov
cbdd50ec8b net: liquidio: fix clang-specific W=1 build warnings
When compiling with clang-18 and W=1, I've noticed the following
warnings:

drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c:1493:16: warning: cast
from 'void (*)(struct octeon_device *, struct octeon_mbox_cmd *, void *)' to
'octeon_mbox_callback_t' (aka 'void (*)(void *, void *, void *)') converts to
incompatible function type [-Wcast-function-type-strict]
 1493 |         mbox_cmd.fn = (octeon_mbox_callback_t)cn23xx_get_vf_stats_callback;
      |                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

and:

drivers/net/ethernet/cavium/liquidio/cn23xx_vf_device.c:432:16: warning: cast
from 'void (*)(struct octeon_device *, struct octeon_mbox_cmd *, void *)' to
'octeon_mbox_callback_t' (aka 'void (*)(void *, void *, void *)') converts to
incompatible function type [-Wcast-function-type-strict]
  432 |         mbox_cmd.fn = (octeon_mbox_callback_t)octeon_pfvf_hs_callback;
      |                       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Fix both of the above by adjusting 'octeon_mbox_callback_t' to match actual
callback definitions (at the cost of adding an extra forward declaration).

Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240111162432.124014-1-dmantipov@yandex.ru
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-12 17:57:09 -08:00
Jakub Kicinski
907ee66817 net: fill in MODULE_DESCRIPTION()s for wx_lib
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add a description to Wangxun's common code lib.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-12 12:17:37 +00:00
Claudiu Beznea
e398822c47 net: phy: micrel: populate .soft_reset for KSZ9131
The RZ/G3S SMARC Module has 2 KSZ9131 PHYs. In this setup, the KSZ9131 PHY
is used with the ravb Ethernet driver. It has been discovered that when
bringing the Ethernet interface down/up continuously, e.g., with the
following sh script:

$ while :; do ifconfig eth0 down; ifconfig eth0 up; done

the link speed and duplex are wrong after interrupting the bring down/up
operation even though the Ethernet interface is up. To recover from this
state the following configuration sequence is necessary (executed
manually):

$ ifconfig eth0 down
$ ifconfig eth0 up

The behavior has been identified also on the Microchip SAMA7G5-EK board
which runs the macb driver and uses the same PHY.

The order of PHY-related operations in ravb_open() is as follows:
ravb_open() ->
  ravb_phy_start() ->
    ravb_phy_init() ->
      of_phy_connect() ->
        phy_connect_direct() ->
	  phy_attach_direct() ->
	    phy_init_hw() ->
	      phydev->drv->soft_reset()
	      phydev->drv->config_init()
	      phydev->drv->config_intr()
	    phy_resume()
	      kszphy_resume()

The order of PHY-related operations in ravb_close is as follows:
ravb_close() ->
  phy_stop() ->
    phy_suspend() ->
      kszphy_suspend() ->
        genphy_suspend()
	  // set BMCR_PDOWN bit in MII_BMCR

In genphy_suspend() setting the BMCR_PDWN bit in MII_BMCR switches the PHY
to Software Power-Down (SPD) mode (according to the KSZ9131 datasheet).
Thus, when opening the interface after it has been  previously closed (via
ravb_close()), the phydev->drv->config_init() and
phydev->drv->config_intr() reach the KSZ9131 PHY driver via the
ksz9131_config_init() and kszphy_config_intr() functions.

KSZ9131 specifies that the MII management interface remains operational
during SPD (Software Power-Down), but (according to manual):
- Only access to the standard registers (0 through 31) is supported.
- Access to MMD address spaces other than MMD address space 1 is possible
  if the spd_clock_gate_override bit is set.
- Access to MMD address space 1 is not possible.

The spd_clock_gate_override bit is not used in the KSZ9131 driver.

ksz9131_config_init() configures RGMII delay, pad skews and LEDs by
accessesing MMD registers other than those in address space 1.

The datasheet for the KSZ9131 does not specify what happens if registers
from an unsupported address space are accessed while the PHY is in SPD.

To fix the issue the .soft_reset method has been instantiated for KSZ9131,
too. This resets the PHY to the default state before doing any
configurations to it, thus switching it out of SPD.

Fixes: bff5b4b37372 ("net: phy: micrel: add Microchip KSZ9131 initial driver")
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj@bp.renesas.com>
Reviewed-by: Maxime Chevallier <maxime.chevallier@bootlin.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-12 11:09:12 +00:00
Horatiu Vultur
acd66c2126 net: micrel: Fix PTP frame parsing for lan8841
The HW has the capability to check each frame if it is a PTP frame,
which domain it is, which ptp frame type it is, different ip address in
the frame. And if one of these checks fail then the frame is not
timestamp. Most of these checks were disabled except checking the field
minorVersionPTP inside the PTP header. Meaning that once a partner sends
a frame compliant to 8021AS which has minorVersionPTP set to 1, then the
frame was not timestamp because the HW expected by default a value of 0
in minorVersionPTP.
Fix this issue by removing this check so the userspace can decide on this.

Fixes: cafc3662ee3f ("net: micrel: Add PHC support for lan8841")
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Divya Koppera <divya.koppera@microchip.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-12 11:08:08 +00:00
Taehee Yoo
bec161add3 amt: do not use overwrapped cb area
amt driver uses skb->cb for storing tunnel information.
This job is worked before TC layer and then amt driver load tunnel info
from skb->cb after TC layer.
So, its cb area should not be overwrapped with CB area used by TC.
In order to not use cb area used by TC, it skips the biggest cb
structure used by TC, which was qdisc_skb_cb.
But it's not anymore.
Currently, biggest structure of TC's CB is tc_skb_cb.
So, it should skip size of tc_skb_cb instead of qdisc_skb_cb.

Fixes: ec624fe740b4 ("net/sched: Extend qdisc control block with tc control block")
Signed-off-by: Taehee Yoo <ap420073@gmail.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20240107144241.4169520-1-ap420073@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:55:17 -08:00
Jakub Kicinski
66cee759ff Merge branch 'net-ethernet-ti-am65-cpsw-allow-for-mtu-values'
Sanjuán García, Jorge says:

====================
net: ethernet: ti: am65-cpsw: Allow for MTU values

The am65-cpsw-nuss driver has a fixed definition for the maximum ethernet
frame length of 1522 bytes (AM65_CPSW_MAX_PACKET_SIZE). This limits the switch
ports to only operate at a maximum MTU of 1500 bytes. When combining this CPSW
switch with a DSA switch connected to one of its ports this limitation shows up.
The extra 8 bytes the DSA subsystem adds internally to the ethernet frame
create resulting frames bigger than 1522 bytes (1518 for non VLAN + 8 for DSA
stuff) so they get dropped by the switch.

One of the issues with the the am65-cpsw-nuss driver is that the network device
max_mtu was being set to the same fixed value defined for the max total frame
length (1522 bytes). This makes the DSA subsystem believe that the MTU of the
interface can be set to 1508 bytes to make room for the extra 8 bytes of the DSA
headers. However, all packages created assuming the 1500 bytes payload get
dropped by the switch as oversized.

This series offers a solution to this problem. The max_mtu advertised on the
network device and the actual max frame size configured on the switch registers
are made consistent by letting the extra room needed for the ethernet headers
and the frame checksum (22 bytes including VLAN).
====================

Link: https://lore.kernel.org/r/20240105085530.14070-1-jorge.sanjuangarcia@duagon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:54:47 -08:00
Sanjuán García, Jorge
64e47d8afb net: ethernet: ti: am65-cpsw: Fix max mtu to fit ethernet frames
The value of AM65_CPSW_MAX_PACKET_SIZE represents the maximum length
of a received frame. This value is written to the register
AM65_CPSW_PORT_REG_RX_MAXLEN.

The maximum MTU configured on the network device should then leave
some room for the ethernet headers and frame check. Otherwise, if
the network interface is configured to its maximum mtu possible,
the frames will be larger than AM65_CPSW_MAX_PACKET_SIZE and will
get dropped as oversized.

The switch supports ethernet frame sizes between 64 and 2024 bytes
(including VLAN) as stated in the technical reference manual, so
define AM65_CPSW_MAX_PACKET_SIZE with that maximum size.

Fixes: 93a76530316a ("net: ethernet: ti: introduce am65x/j721e gigabit eth subsystem driver")
Signed-off-by: Jorge Sanjuan Garcia <jorge.sanjuangarcia@duagon.com>
Reviewed-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Link: https://lore.kernel.org/r/20240105085530.14070-2-jorge.sanjuangarcia@duagon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:54:45 -08:00
Zhu Yanjun
e3fe8d28c6 virtio_net: Fix "‘%d’ directive writing between 1 and 11 bytes into a region of size 10" warnings
Fix the warnings when building virtio_net driver.

"
drivers/net/virtio_net.c: In function ‘init_vqs’:
drivers/net/virtio_net.c:4551:48: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 10 [-Wformat-overflow=]
 4551 |                 sprintf(vi->rq[i].name, "input.%d", i);
      |                                                ^~
In function ‘virtnet_find_vqs’,
    inlined from ‘init_vqs’ at drivers/net/virtio_net.c:4645:8:
drivers/net/virtio_net.c:4551:41: note: directive argument in the range [-2147483643, 65534]
 4551 |                 sprintf(vi->rq[i].name, "input.%d", i);
      |                                         ^~~~~~~~~~
drivers/net/virtio_net.c:4551:17: note: ‘sprintf’ output between 8 and 18 bytes into a destination of size 16
 4551 |                 sprintf(vi->rq[i].name, "input.%d", i);
      |                 ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
drivers/net/virtio_net.c: In function ‘init_vqs’:
drivers/net/virtio_net.c:4552:49: warning: ‘%d’ directive writing between 1 and 11 bytes into a region of size 9 [-Wformat-overflow=]
 4552 |                 sprintf(vi->sq[i].name, "output.%d", i);
      |                                                 ^~
In function ‘virtnet_find_vqs’,
    inlined from ‘init_vqs’ at drivers/net/virtio_net.c:4645:8:
drivers/net/virtio_net.c:4552:41: note: directive argument in the range [-2147483643, 65534]
 4552 |                 sprintf(vi->sq[i].name, "output.%d", i);
      |                                         ^~~~~~~~~~~
drivers/net/virtio_net.c:4552:17: note: ‘sprintf’ output between 9 and 19 bytes into a destination of size 16
 4552 |                 sprintf(vi->sq[i].name, "output.%d", i);

"

Reviewed-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@linux.dev>
Link: https://lore.kernel.org/r/20240104020902.2753599-1-yanjun.zhu@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:54:34 -08:00
Nithin Dabilpuram
a0cb76a770 octeontx2-af: CN10KB: Fix FIFO length calculation for RPM2
RPM0 and RPM1 on the CN10KB SoC have 8 LMACs each, whereas RPM2
has only 4 LMACs. Similarly, the RPM0 and RPM1 have 256KB FIFO,
whereas RPM2 has 128KB FIFO. This patch fixes an issue with
improper TX credit programming for the RPM2 link.

Fixes: b9d0fedc6234 ("octeontx2-af: cn10kb: Add RPM_USX MAC support")
Signed-off-by: Nithin Dabilpuram <ndabilpuram@marvell.com>
Signed-off-by: Naveen Mamindlapalli <naveenm@marvell.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240108073036.8766-1-naveenm@marvell.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:48:21 -08:00
Jakub Kicinski
3722a98752 Merge branch 'rtnetlink-allow-to-enslave-with-one-msg-an-up-interface'
Nicolas Dichtel says:

====================
rtnetlink: allow to enslave with one msg an up interface

The first patch fixes a regression, introduced in linux v6.1, by reverting
a patch. The second patch adds a test to verify this API.
====================

Link: https://lore.kernel.org/r/20240108094103.2001224-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:47:44 -08:00
Nicolas Dichtel
a159cbe81d selftests: rtnetlink: check enslaving iface in a bond
The goal is to check the following two sequences:
> ip link set dummy0 up
> ip link set dummy0 master bond0 down

Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240108094103.2001224-3-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:47:40 -08:00
Nicolas Dichtel
ec4ffd100f Revert "net: rtnetlink: Enslave device before bringing it up"
This reverts commit a4abfa627c3865c37e036bccb681619a50d3d93c.

The patch broke:
> ip link set dummy0 up
> ip link set dummy0 master bond0 down

This last command is useful to be able to enslave an interface with only
one netlink message.

After discussion, there is no good reason to support:
> ip link set dummy0 down
> ip link set dummy0 master bond0 up
because the bond interface already set the slave up when it is up.

Cc: stable@vger.kernel.org
Fixes: a4abfa627c38 ("net: rtnetlink: Enslave device before bringing it up")
Signed-off-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Hangbin Liu <liuhangbin@gmail.com>
Link: https://lore.kernel.org/r/20240108094103.2001224-2-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:47:40 -08:00
David Howells
8722014311 rxrpc: Fix use of Don't Fragment flag
rxrpc normally has the Don't Fragment flag set on the UDP packets it
transmits, except when it has decided that DATA packets aren't getting
through - in which case it turns it off just for the DATA transmissions.
This can be a problem, however, for RESPONSE packets that convey
authentication and crypto data from the client to the server as ticket may
be larger than can fit in the MTU.

In such a case, rxrpc gets itself into an infinite loop as the sendmsg
returns an error (EMSGSIZE), which causes rxkad_send_response() to return
-EAGAIN - and the CHALLENGE packet is put back on the Rx queue to retry,
leading to the I/O thread endlessly attempting to perform the transmission.

Fix this by disabling DF on RESPONSE packets for now.  The use of DF and
best data MTU determination needs reconsidering at some point in the
future.

Fixes: 17926a79320a ("[AF_RXRPC]: Provide secure RxRPC sockets for use by userspace and kernel both")
Reported-by: Marc Dionne <marc.dionne@auristor.com>
Signed-off-by: David Howells <dhowells@redhat.com>
cc: linux-afs@lists.infradead.org
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/1581852.1704813048@warthog.procyon.org.uk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:41:41 -08:00
Vladimir Oltean
844f104790 net: dsa: fix netdev_priv() dereference before check on non-DSA netdevice events
After the blamed commit, we started doing this dereference for every
NETDEV_CHANGEUPPER and NETDEV_PRECHANGEUPPER event in the system.

static inline struct dsa_port *dsa_user_to_port(const struct net_device *dev)
{
	struct dsa_user_priv *p = netdev_priv(dev);

	return p->dp;
}

Which is obviously bogus, because not all net_devices have a netdev_priv()
of type struct dsa_user_priv. But struct dsa_user_priv is fairly small,
and p->dp means dereferencing 8 bytes starting with offset 16. Most
drivers allocate that much private memory anyway, making our access not
fault, and we discard the bogus data quickly afterwards, so this wasn't
caught.

But the dummy interface is somewhat special in that it calls
alloc_netdev() with a priv size of 0. So every netdev_priv() dereference
is invalid, and we get this when we emit a NETDEV_PRECHANGEUPPER event
with a VLAN as its new upper:

$ ip link add dummy1 type dummy
$ ip link add link dummy1 name dummy1.100 type vlan id 100
[   43.309174] ==================================================================
[   43.316456] BUG: KASAN: slab-out-of-bounds in dsa_user_prechangeupper+0x30/0xe8
[   43.323835] Read of size 8 at addr ffff3f86481d2990 by task ip/374
[   43.330058]
[   43.342436] Call trace:
[   43.366542]  dsa_user_prechangeupper+0x30/0xe8
[   43.371024]  dsa_user_netdevice_event+0xb38/0xee8
[   43.375768]  notifier_call_chain+0xa4/0x210
[   43.379985]  raw_notifier_call_chain+0x24/0x38
[   43.384464]  __netdev_upper_dev_link+0x3ec/0x5d8
[   43.389120]  netdev_upper_dev_link+0x70/0xa8
[   43.393424]  register_vlan_dev+0x1bc/0x310
[   43.397554]  vlan_newlink+0x210/0x248
[   43.401247]  rtnl_newlink+0x9fc/0xe30
[   43.404942]  rtnetlink_rcv_msg+0x378/0x580

Avoid the kernel oops by dereferencing after the type check, as customary.

Fixes: 4c3f80d22b2e ("net: dsa: walk through all changeupper notifier functions")
Reported-and-tested-by: syzbot+d81bcd883824180500c8@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/netdev/0000000000001d4255060e87545c@google.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240110003354.2796778-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:33:52 -08:00
Lin Ma
b33fb5b801 net: qualcomm: rmnet: fix global oob in rmnet_policy
The variable rmnet_link_ops assign a *bigger* maxtype which leads to a
global out-of-bounds read when parsing the netlink attributes. See bug
trace below:

==================================================================
BUG: KASAN: global-out-of-bounds in validate_nla lib/nlattr.c:386 [inline]
BUG: KASAN: global-out-of-bounds in __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600
Read of size 1 at addr ffffffff92c438d0 by task syz-executor.6/84207

CPU: 0 PID: 84207 Comm: syz-executor.6 Tainted: G                 N 6.1.0 #3
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x8b/0xb3 lib/dump_stack.c:106
 print_address_description mm/kasan/report.c:284 [inline]
 print_report+0x172/0x475 mm/kasan/report.c:395
 kasan_report+0xbb/0x1c0 mm/kasan/report.c:495
 validate_nla lib/nlattr.c:386 [inline]
 __nla_validate_parse+0x24af/0x2750 lib/nlattr.c:600
 __nla_parse+0x3e/0x50 lib/nlattr.c:697
 nla_parse_nested_deprecated include/net/netlink.h:1248 [inline]
 __rtnl_newlink+0x50a/0x1880 net/core/rtnetlink.c:3485
 rtnl_newlink+0x64/0xa0 net/core/rtnetlink.c:3594
 rtnetlink_rcv_msg+0x43c/0xd70 net/core/rtnetlink.c:6091
 netlink_rcv_skb+0x14f/0x410 net/netlink/af_netlink.c:2540
 netlink_unicast_kernel net/netlink/af_netlink.c:1319 [inline]
 netlink_unicast+0x54e/0x800 net/netlink/af_netlink.c:1345
 netlink_sendmsg+0x930/0xe50 net/netlink/af_netlink.c:1921
 sock_sendmsg_nosec net/socket.c:714 [inline]
 sock_sendmsg+0x154/0x190 net/socket.c:734
 ____sys_sendmsg+0x6df/0x840 net/socket.c:2482
 ___sys_sendmsg+0x110/0x1b0 net/socket.c:2536
 __sys_sendmsg+0xf3/0x1c0 net/socket.c:2565
 do_syscall_x64 arch/x86/entry/common.c:50 [inline]
 do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fdcf2072359
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 f1 19 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007fdcf13e3168 EFLAGS: 00000246 ORIG_RAX: 000000000000002e
RAX: ffffffffffffffda RBX: 00007fdcf219ff80 RCX: 00007fdcf2072359
RDX: 0000000000000000 RSI: 0000000020000200 RDI: 0000000000000003
RBP: 00007fdcf20bd493 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 00007fffbb8d7bdf R14: 00007fdcf13e3300 R15: 0000000000022000
 </TASK>

The buggy address belongs to the variable:
 rmnet_policy+0x30/0xe0

The buggy address belongs to the physical page:
page:0000000065bdeb3c refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x155243
flags: 0x200000000001000(reserved|node=0|zone=2)
raw: 0200000000001000 ffffea00055490c8 ffffea00055490c8 0000000000000000
raw: 0000000000000000 0000000000000000 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
 ffffffff92c43780: f9 f9 f9 f9 00 00 00 02 f9 f9 f9 f9 00 00 00 07
 ffffffff92c43800: f9 f9 f9 f9 00 00 00 05 f9 f9 f9 f9 06 f9 f9 f9
>ffffffff92c43880: f9 f9 f9 f9 00 00 00 00 00 00 f9 f9 f9 f9 f9 f9
                                                 ^
 ffffffff92c43900: 00 00 00 00 00 00 00 00 07 f9 f9 f9 f9 f9 f9 f9
 ffffffff92c43980: 00 00 00 07 f9 f9 f9 f9 00 00 00 05 f9 f9 f9 f9

According to the comment of `nla_parse_nested_deprecated`, the maxtype
should be len(destination array) - 1. Hence use `IFLA_RMNET_MAX` here.

Fixes: 14452ca3b5ce ("net: qualcomm: rmnet: Export mux_id and flags to netlink")
Signed-off-by: Lin Ma <linma@zju.edu.cn>
Reviewed-by: Subash Abhinov Kasiviswanathan <quic_subashab@quicinc.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240110061400.3356108-1-linma@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:32:03 -08:00
Dmitry Safonov
e689a87696 selftests/net/tcp-ao: Use LDLIBS instead of LDFLAGS
The rules to link selftests are:

> $(OUTPUT)/%_ipv4: %.c
> 	$(LINK.c) $^ $(LDLIBS) -o $@
>
> $(OUTPUT)/%_ipv6: %.c
> 	$(LINK.c) -DIPV6_TEST $^ $(LDLIBS) -o $@

The intel test robot uses only selftest's Makefile, not the top linux
Makefile:

> make W=1 O=/tmp/kselftest -C tools/testing/selftests

So, $(LINK.c) is determined by environment, rather than by kernel
Makefiles. On my machine (as well as other people that ran tcp-ao
selftests) GNU/Make implicit definition does use $(LDFLAGS):

> [dima@Mindolluin ~]$ make -p -f/dev/null | grep '^LINK.c\>'
> make: *** No targets.  Stop.
> LINK.c = $(CC) $(CFLAGS) $(CPPFLAGS) $(LDFLAGS) $(TARGET_ARCH)

But, according to build robot report, it's not the case for them.
While I could just avoid using pre-defined $(LINK.c), it's also used by
selftests/lib.mk by default.

Anyways, according to GNU/Make documentation [1], I should have used
$(LDLIBS) instead of $(LDFLAGS) in the first place, so let's just do it:

> LDFLAGS
>     Extra flags to give to compilers when they are supposed to invoke
>     the linker, ‘ld’, such as -L. Libraries (-lfoo) should be added
>     to the LDLIBS variable instead.
> LDLIBS
>     Library flags or names given to compilers when they are supposed
>     to invoke the linker, ‘ld’. LOADLIBES is a deprecated (but still
>     supported) alternative to LDLIBS. Non-library linker flags, such
>     as -L, should go in the LDFLAGS variable.

[1]: https://www.gnu.org/software/make/manual/html_node/Implicit-Variables.html

Fixes: cfbab37b3da0 ("selftests/net: Add TCP-AO library")
Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202401011151.veyYTJzq-lkp@intel.com/
Signed-off-by: Dmitry Safonov <dima@arista.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Link: https://lore.kernel.org/r/20240110-tcp_ao-selftests-makefile-v1-1-aa07d043f052@arista.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:31:04 -08:00
Arnd Bergmann
b3739fb3a9 wangxunx: select CONFIG_PHYLINK where needed
The ngbe driver needs phylink:

arm-linux-gnueabi-ld: drivers/net/ethernet/wangxun/libwx/wx_ethtool.o: in function `wx_nway_reset':
wx_ethtool.c:(.text+0x458): undefined reference to `phylink_ethtool_nway_reset'
arm-linux-gnueabi-ld: drivers/net/ethernet/wangxun/ngbe/ngbe_main.o: in function `ngbe_remove':
ngbe_main.c:(.text+0x7c): undefined reference to `phylink_destroy'
arm-linux-gnueabi-ld: drivers/net/ethernet/wangxun/ngbe/ngbe_main.o: in function `ngbe_open':
ngbe_main.c:(.text+0xf90): undefined reference to `phylink_connect_phy'
arm-linux-gnueabi-ld: drivers/net/ethernet/wangxun/ngbe/ngbe_mdio.o: in function `ngbe_mdio_init':
ngbe_mdio.c:(.text+0x314): undefined reference to `phylink_create'

Add the missing Kconfig description for this.

Fixes: bc2426d74aa3 ("net: ngbe: convert phylib to phylink")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240111162828.68564-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:26:36 -08:00
Jakub Kicinski
f9678f5825 MAINTAINERS: ibmvnic: drop Dany from reviewers
I missed that Dany uses a different email address
when tagging patches (drt@linux.ibm.com)
and asked him if he's still actively working on ibmvnic.
He doesn't really fall under our removal criteria,
but he admitted that he already moved on to other projects.

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240109164517.3063131-8-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:25:04 -08:00
Jakub Kicinski
bd93edbfd7 MAINTAINERS: mark ax25 as Orphan
We haven't heard from Ralf for two years, according to lore.
We get a constant stream of "fixes" to ax25 from people using
code analysis tools. Nobody is reviewing those, let's reflect
this reality in MAINTAINERS.

Subsystem AX.25 NETWORK LAYER
  Changes 9 / 59 (15%)
  (No activity)
  Top reviewers:
    [2]: mkl@pengutronix.de
    [2]: edumazet@google.com
    [2]: stefan@datenfreihafen.org
  INACTIVE MAINTAINER Ralf Baechle <ralf@linux-mips.org>

Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240109164517.3063131-7-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:25:04 -08:00
Jakub Kicinski
0bfcdce867 MAINTAINERS: Bluetooth: retire Johan (for now?)
Johan moved to maintaining the Zephyr Bluetooth stack,
and we haven't heard from him on the ML in 3 years
(according to lore), and seen any tags in git in 4 years.
Trade the MAINTAINER entry for CREDITS, we can revert
whenever Johan comes back to Linux hacking :)

Subsystem BLUETOOTH SUBSYSTEM
  Changes 173 / 986 (17%)
  Last activity: 2023-12-22
  Marcel Holtmann <marcel@holtmann.org>:
    Author 91cb4c19118a 2022-01-27 00:00:00 52
    Committer edcb185fa9c4 2022-05-23 00:00:00 446
    Tags 000c2fa2c144 2023-04-23 00:00:00 523
  Johan Hedberg <johan.hedberg@gmail.com>:
  Luiz Augusto von Dentz <luiz.dentz@gmail.com>:
    Author d03376c18592 2023-12-22 00:00:00 241
    Committer da9065caa594 2023-12-22 00:00:00 341
    Tags da9065caa594 2023-12-22 00:00:00 493
  Top reviewers:
    [33]: alainm@chromium.org
    [31]: mcchou@chromium.org
    [27]: abhishekpandit@chromium.org
  INACTIVE MAINTAINER Johan Hedberg <johan.hedberg@gmail.com>

Link: https://lore.kernel.org/r/20240109164517.3063131-6-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:25:03 -08:00
Jakub Kicinski
384a35866f MAINTAINERS: eth: mark Cavium liquidio as an Orphan
We haven't seen much review activity from the liquidio
maintainers for years. Reflect that reality in MAINTAINERS.
Our scripts report:

Subsystem CAVIUM LIQUIDIO NETWORK DRIVER
  Changes 30 / 87 (34%)
  Last activity: 2019-01-28
  Derek Chickles <dchickles@marvell.com>:
    Tags ac93e2fa8550 2019-01-28 00:00:00 1
  Satanand Burla <sburla@marvell.com>:
  Felix Manlunas <fmanlunas@marvell.com>:
    Tags ac93e2fa8550 2019-01-28 00:00:00 1
  Top reviewers:
    [5]: simon.horman@corigine.com
    [4]: keescook@chromium.org
    [4]: jiri@nvidia.com
  INACTIVE MAINTAINER Satanand Burla <sburla@marvell.com>

Acked-by: Derek Chickles <dchickles@marvell.com>
Link: https://lore.kernel.org/r/20240109164517.3063131-5-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:24:57 -08:00
Jakub Kicinski
009a98bca6 MAINTAINERS: eth: mvneta: move Thomas to CREDITS
Thomas is still active in other bits of the kernel and beyond
but not as much on the Marvell Ethernet devices.
Our scripts report:

Subsystem MARVELL MVNETA ETHERNET DRIVER
  Changes 54 / 176 (30%)
  (No activity)
  Top reviewers:
    [12]: hawk@kernel.org
    [9]: toke@redhat.com
    [9]: john.fastabend@gmail.com
  INACTIVE MAINTAINER Thomas Petazzoni <thomas.petazzoni@bootlin.com>

Acked-by: Thomas Petazzoni <thomas.petazzoni@bootlin.com>
Link: https://lore.kernel.org/r/20240109164517.3063131-4-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:24:11 -08:00
Jakub Kicinski
b59d8485fe MAINTAINERS: eth: mt7530: move Landen Chao to CREDITS
mt7530 is a pretty active driver and last we have heard
from Landen Chao on the list was March. There were total
of 4 message from them in the last 2.5 years.
I think it's time to move to CREDITS.

Subsystem MEDIATEK SWITCH DRIVER
  Changes 94 / 169 (55%)
  Last activity: 2023-10-11
  Arınç ÜNAL <arinc.unal@arinc9.com>:
    Author e94b590abfff 2023-08-19 00:00:00 12
    Tags e94b590abfff 2023-08-19 00:00:00 16
  Daniel Golle <daniel@makrotopia.org>:
    Author 91daa4f62ce8 2023-04-19 00:00:00 17
    Tags ac49b992578d 2023-10-11 00:00:00 20
  Landen Chao <Landen.Chao@mediatek.com>:
  DENG Qingfang <dqfext@gmail.com>:
    Author 342afce10d6f 2021-10-18 00:00:00 24
    Tags 342afce10d6f 2021-10-18 00:00:00 25
  Sean Wang <sean.wang@mediatek.com>:
    Tags c288575f7810 2020-09-14 00:00:00 5
  Top reviewers:
    [46]: f.fainelli@gmail.com
    [29]: andrew@lunn.ch
    [19]: olteanv@gmail.com
  INACTIVE MAINTAINER Landen Chao <Landen.Chao@mediatek.com>

Acked-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Link: https://lore.kernel.org/r/20240109164517.3063131-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:24:11 -08:00
Jakub Kicinski
da14d1fed9 MAINTAINERS: eth: mtk: move John to CREDITS
John is still active in other bits of the kernel but not much
on the MediaTek ethernet switch side. Our scripts report:

Subsystem MEDIATEK ETHERNET DRIVER
  Changes 81 / 384 (21%)
  Last activity: 2023-12-21
  Felix Fietkau <nbd@nbd.name>:
    Author c6d96df9fa2c 2023-05-02 00:00:00 42
    Tags c6d96df9fa2c 2023-05-02 00:00:00 48
  John Crispin <john@phrozen.org>:
  Sean Wang <sean.wang@mediatek.com>:
    Author 880c2d4b2fdf 2019-06-03 00:00:00 5
    Tags a5d75538295b 2020-04-07 00:00:00 7
  Mark Lee <Mark-MC.Lee@mediatek.com>:
    Author 8d66a8183d0c 2019-11-14 00:00:00 4
    Tags 8d66a8183d0c 2019-11-14 00:00:00 4
  Lorenzo Bianconi <lorenzo@kernel.org>:
    Author 7cb8cd4daacf 2023-12-21 00:00:00 98
    Tags 7cb8cd4daacf 2023-12-21 00:00:00 112
  Top reviewers:
    [18]: horms@kernel.org
    [15]: leonro@nvidia.com
    [8]: rmk+kernel@armlinux.org.uk
  INACTIVE MAINTAINER John Crispin <john@phrozen.org>

Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: John Crispin <john@phrozen.org>
Link: https://lore.kernel.org/r/20240109164517.3063131-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:24:06 -08:00
Jakub Kicinski
5ecba0101d Merge branch 'fix-module_description-for-net-p1'
Breno Leitao says:

====================
Fix MODULE_DESCRIPTION() for net (p1)

There are hundreds of network modules that misses MODULE_DESCRIPTION(),
causing a warnning when compiling with W=1. Example:

	WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/arcnet/com90io.o
	WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/arcnet/arc-rimi.o
	WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/arcnet/com20020.o
====================

Link: https://lore.kernel.org/r/20240108181610.2697017-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:16:09 -08:00
Breno Leitao
c155eca076 net: fill in MODULE_DESCRIPTION()s for s2io
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the Neterion 10GbE (s2io) driver.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240108181610.2697017-10-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:16:08 -08:00
Breno Leitao
ade9875612 net: fill in MODULE_DESCRIPTION()s for ds26522 module
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to Slic Maxim ds26522 card driver.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240108181610.2697017-9-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:16:08 -08:00
Breno Leitao
d8610e431f net: fill in MODULE_DESCRIPTION()s for Sun RPC
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to Sun RPC modules.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
Link: https://lore.kernel.org/r/20240108181610.2697017-6-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:16:08 -08:00
Breno Leitao
95c236cc5f net: fill in MODULE_DESCRIPTION()s for NFC
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to all NFC Controller Interface (NCI) modules.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Krzysztof Kozlowski <krzysztof.kozlowski@linaro.org>
Link: https://lore.kernel.org/r/20240108181610.2697017-5-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:16:08 -08:00
Breno Leitao
417d8c571c net: fill in MODULE_DESCRIPTION()s for HSR
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to High-availability Seamless Redundancy (HSR) driver.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240108181610.2697017-4-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:16:08 -08:00
Breno Leitao
e1b1d282d5 net: fill in MODULE_DESCRIPTION()s for SLIP
W=1 builds now warn if module is built without a MODULE_DESCRIPTION().
Add descriptions to the Serial Line (SLIP) protocol modules.

Signed-off-by: Breno Leitao <leitao@debian.org>
Link: https://lore.kernel.org/r/20240108181610.2697017-3-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-11 16:16:08 -08:00
Linus Torvalds
3e7aeb78ab Networking changes for 6.8.
Core & protocols
 ----------------
 
  - Analyze and reorganize core networking structs (socks, netdev,
    netns, mibs) to optimize cacheline consumption and set up
    build time warnings to safeguard against future header changes.
    This improves TCP performances with many concurrent connections
    up to 40%.
 
  - Add page-pool netlink-based introspection, exposing the
    memory usage and recycling stats. This helps indentify
    bad PP users and possible leaks.
 
  - Refine TCP/DCCP source port selection to no longer favor even
    source port at connect() time when IP_LOCAL_PORT_RANGE is set.
    This lowers the time taken by connect() for hosts having
    many active connections to the same destination.
 
  - Refactor the TCP bind conflict code, shrinking related socket
    structs.
 
  - Refactor TCP SYN-Cookie handling, as a preparation step to
    allow arbitrary SYN-Cookie processing via eBPF.
 
  - Tune optmem_max for 0-copy usage, increasing the default value
    to 128KB and namespecifying it.
 
  - Allow coalescing for cloned skbs coming from page pools, improving
    RX performances with some common configurations.
 
  - Reduce extension header parsing overhead at GRO time.
 
  - Add bridge MDB bulk deletion support, allowing user-space to
    request the deletion of matching entries.
 
  - Reorder nftables struct members, to keep data accessed by the
    datapath first.
 
  - Introduce TC block ports tracking and use. This allows supporting
    multicast-like behavior at the TC layer.
 
  - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
    classifiers (RSVP and tcindex).
 
  - More data-race annotations.
 
  - Extend the diag interface to dump TCP bound-only sockets.
 
  - Conditional notification of events for TC qdisc class and actions.
 
  - Support for WPAN dynamic associations with nearby devices, to form
    a sub-network using a specific PAN ID.
 
  - Implement SMCv2.1 virtual ISM device support.
 
  - Add support for Batman-avd mulicast packet type.
 
 BPF
 ---
 
  - Tons of verifier improvements:
    - BPF register bounds logic and range support along with a large
      test suite
    - log improvements
    - complete precision tracking support for register spills
    - track aligned STACK_ZERO cases as imprecise spilled registers. It
      improves the verifier "instructions processed" metric from single
      digit to 50-60% for some programs
    - support for user's global BPF subprogram arguments with few
      commonly requested annotations for a better developer experience
    - support tracking of BPF_JNE which helps cases when the compiler
      transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
      like
    - several fixes
 
  - Add initial TX metadata implementation for AF_XDP with support in
    mlx5 and stmmac drivers. Two types of offloads are supported right
    now, that is, TX timestamp and TX checksum offload.
 
  - Fix kCFI bugs in BPF all forms of indirect calls from BPF into
    kernel and from kernel into BPF work with CFI enabled. This allows
    BPF to work with CONFIG_FINEIBT=y.
 
  - Change BPF verifier logic to validate global subprograms lazily
    instead of unconditionally before the main program, so they can be
    guarded using BPF CO-RE techniques.
 
  - Support uid/gid options when mounting bpffs.
 
  - Add a new kfunc which acquires the associated cgroup of a task
    within a specific cgroup v1 hierarchy where the latter is identified
    by its id.
 
  - Extend verifier to allow bpf_refcount_acquire() of a map value field
    obtained via direct load which is a use-case needed in sched_ext.
 
  - Add BPF link_info support for uprobe multi link along with bpftool
    integration for the latter.
 
  - Support for VLAN tag in XDP hints.
 
  - Remove deprecated bpfilter kernel leftovers given the project
    is developed in user-space (https://github.com/facebook/bpfilter).
 
 Misc
 ----
 
  - Support for parellel TC self-tests execution.
 
  - Increase MPTCP self-tests coverage.
 
  - Updated the bridge documentation, including several so-far
    undocumented features.
 
  - Convert all the net self-tests to run in unique netns, to
    avoid random failures due to conflict and allow concurrent
    runs.
 
  - Add TCP-AO self-tests.
 
  - Add kunit tests for both cfg80211 and mac80211.
 
  - Autogenerate Netlink families documentation from YAML spec.
 
  - Add yml-gen support for fixed headers and recursive nests, the
    tool can now generate user-space code for all genetlink families
    for which we have specs.
 
  - A bunch of additional module descriptions fixes.
 
  - Catch incorrect freeing of pages belonging to a page pool.
 
 Driver API
 ----------
 
  - Rust abstractions for network PHY drivers; do not cover yet the
    full C API, but already allow implementing functional PHY drivers
    in rust.
 
  - Introduce queue and NAPI support in the netdev Netlink interface,
    allowing complete access to the device <> NAPIs <> queues
    relationship.
 
  - Introduce notifications filtering for devlink to allow control
    application scale to thousands of instances.
 
  - Improve PHY validation, requesting rate matching information for
    each ethtool link mode supported by both the PHY and host.
 
  - Add support for ethtool symmetric-xor RSS hash.
 
  - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
    platform.
 
  - Expose pin fractional frequency offset value over new DPLL generic
    netlink attribute.
 
  - Convert older drivers to platform remove callback returning void.
 
  - Add support for PHY package MMD read/write.
 
 New hardware / drivers
 ----------------------
 
  - Ethernet:
    - Octeon CN10K devices
    - Broadcom 5760X P7
    - Qualcomm SM8550 SoC
    - Texas Instrument DP83TG720S PHY
 
  - Bluetooth:
    - IMC Networks Bluetooth radio
 
 Removed
 -------
 
  - WiFi:
    - libertas 16-bit PCMCIA support
    - Atmel at76c50x drivers
    - HostAP ISA/PCMCIA style 802.11b driver
    - zd1201 802.11b USB dongles
    - Orinoco ISA/PCMCIA 802.11b driver
    - Aviator/Raytheon driver
    - Planet WL3501 driver
    - RNDIS USB 802.11b driver
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Intel (100G, ice, idpf):
      - allow one by one port representors creation and removal
      - add temperature and clock information reporting
      - add get/set for ethtool's header split ringparam
      - add again FW logging
      - adds support switchdev hardware packet mirroring
      - iavf: implement symmetric-xor RSS hash
      - igc: add support for concurrent physical and free-running timers
      - i40e: increase the allowable descriptors
    - nVidia/Mellanox:
      - Preparation for Socket-Direct multi-dev netdev. That will allow
        in future releases combining multiple PFs devices attached to
        different NUMA nodes under the same netdev
    - Broadcom (bnxt):
      - TX completion handling improvements
      - add basic ntuple filter support
      - reduce MSIX vectors usage for MQPRIO offload
      - add VXLAN support, USO offload and TX coalesce completion for P7
    - Marvell Octeon EP:
      - xmit-more support
      - add PF-VF mailbox support and use it for FW notifications for VFs
    - Wangxun (ngbe/txgbe):
      - implement ethtool functions to operate pause param, ring param,
        coalesce channel number and msglevel
    - Netronome/Corigine (nfp):
      - add flow-steering support
      - support UDP segmentation offload
 
  - Ethernet NICs embedded, slower, virtual:
    - Xilinx AXI: remove duplicate DMA code adopting the dma engine driver
    - stmmac: add support for HW-accelerated VLAN stripping
    - TI AM654x sw: add mqprio, frame preemption & coalescing
    - gve: add support for non-4k page sizes.
    - virtio-net: support dynamic coalescing moderation
 
  - nVidia/Mellanox Ethernet datacenter switches:
    - allow firmware upgrade without a reboot
    - more flexible support for bridge flooding via the compressed
      FID flooding mode
 
  - Ethernet embedded switches:
    - Microchip:
      - fine-tune flow control and speed configurations in KSZ8xxx
      - KSZ88X3: enable setting rmii reference
    - Renesas:
      - add jumbo frames support
    - Marvell:
      - 88E6xxx: add "eth-mac" and "rmon" stats support
 
  - Ethernet PHYs:
    - aquantia: add firmware load support
    - at803x: refactor the driver to simplify adding support for more
      chip variants
    - NXP C45 TJA11xx: Add MACsec offload support
 
  - Wifi:
    - MediaTek (mt76):
      - NVMEM EEPROM improvements
      - mt7996 Extremely High Throughput (EHT) improvements
      - mt7996 Wireless Ethernet Dispatcher (WED) support
      - mt7996 36-bit DMA support
    - Qualcomm (ath12k):
      - support for a single MSI vector
      - WCN7850: support AP mode
    - Intel (iwlwifi):
      - new debugfs file fw_dbg_clear
      - allow concurrent P2P operation on DFS channels
 
  - Bluetooth:
    - QCA2066: support HFP offload
    - ISO: more broadcast-related improvements
    - NXP: better recovery in case receiver/transmitter get out of sync
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmWdamsSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkGC4P/2xjLzdw22ckSssuE9ORbGko9SNjnqHk
 PQh1E+26BHiCg5KB8VvzMsL78E79MRNXEattSW+1g7dhCvln3oi+Vd0WkdRkgt35
 98Iv18zLbbwFAJeyKvmLAPAkQkMLtVj19QILBBRrugF+egEZgVSE3JBcTAiKv2ZQ
 HzkabA171Ri6LpCcEEtY5XuaKvimGnGzF8YMFf8rX0wtqd2p5kbY9aMe47WAGxvU
 Vf9548XvH+A5yVH2/4/gujtUOpA/RHuhuCMb+oo0cZ+VCC1x9MGzoXzj6r87OTkf
 k2W1whNzcGoin92f+9Lk1JYMuiGKBH4QVaDdNXJnYFSJWPTE7RvRsPzYTSD4/GzK
 yEZbzSJXpy/2vDQm16NoAxl7evRs8Sorzkw4LQRviZHI/5SAkK2ZQiCK5CO8QSYy
 C1LELcV5kn6Foe24xWnrWLjAGug9oJnYoGPMU5gvPmFJMvUMXqm5rmbBgUWL5Rxw
 q1M6gVzabCyWUy6z2G2vaqW2ZntNVvCkdsLtIX0XZkcTzNoP0MA+TuhyGz4wbiuo
 PeyQp/mbGnDgCYggqKIA0YWrTVxkhFrKN520cbO8qXBQytV9oFbM/0/+C0/r/5WX
 pL1JVzLrh6l5ME7EIQfha8UOF9j8q4ueSwb40P3AR2NaZiDABM0zfUZ6+sx+91WF
 ucqPEcZB5cRE
 =1bW6
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Paolo Abeni:
 "The most interesting thing is probably the networking structs
  reorganization and a significant amount of changes is around
  self-tests.

  Core & protocols:

   - Analyze and reorganize core networking structs (socks, netdev,
     netns, mibs) to optimize cacheline consumption and set up build
     time warnings to safeguard against future header changes

     This improves TCP performances with many concurrent connections up
     to 40%

   - Add page-pool netlink-based introspection, exposing the memory
     usage and recycling stats. This helps indentify bad PP users and
     possible leaks

   - Refine TCP/DCCP source port selection to no longer favor even
     source port at connect() time when IP_LOCAL_PORT_RANGE is set. This
     lowers the time taken by connect() for hosts having many active
     connections to the same destination

   - Refactor the TCP bind conflict code, shrinking related socket
     structs

   - Refactor TCP SYN-Cookie handling, as a preparation step to allow
     arbitrary SYN-Cookie processing via eBPF

   - Tune optmem_max for 0-copy usage, increasing the default value to
     128KB and namespecifying it

   - Allow coalescing for cloned skbs coming from page pools, improving
     RX performances with some common configurations

   - Reduce extension header parsing overhead at GRO time

   - Add bridge MDB bulk deletion support, allowing user-space to
     request the deletion of matching entries

   - Reorder nftables struct members, to keep data accessed by the
     datapath first

   - Introduce TC block ports tracking and use. This allows supporting
     multicast-like behavior at the TC layer

   - Remove UAPI support for retired TC qdiscs (dsmark, CBQ and ATM) and
     classifiers (RSVP and tcindex)

   - More data-race annotations

   - Extend the diag interface to dump TCP bound-only sockets

   - Conditional notification of events for TC qdisc class and actions

   - Support for WPAN dynamic associations with nearby devices, to form
     a sub-network using a specific PAN ID

   - Implement SMCv2.1 virtual ISM device support

   - Add support for Batman-avd mulicast packet type

  BPF:

   - Tons of verifier improvements:
       - BPF register bounds logic and range support along with a large
         test suite
       - log improvements
       - complete precision tracking support for register spills
       - track aligned STACK_ZERO cases as imprecise spilled registers.
         This improves the verifier "instructions processed" metric from
         single digit to 50-60% for some programs
       - support for user's global BPF subprogram arguments with few
         commonly requested annotations for a better developer
         experience
       - support tracking of BPF_JNE which helps cases when the compiler
         transforms (unsigned) "a > 0" into "if a == 0 goto xxx" and the
         like
       - several fixes

   - Add initial TX metadata implementation for AF_XDP with support in
     mlx5 and stmmac drivers. Two types of offloads are supported right
     now, that is, TX timestamp and TX checksum offload

   - Fix kCFI bugs in BPF all forms of indirect calls from BPF into
     kernel and from kernel into BPF work with CFI enabled. This allows
     BPF to work with CONFIG_FINEIBT=y

   - Change BPF verifier logic to validate global subprograms lazily
     instead of unconditionally before the main program, so they can be
     guarded using BPF CO-RE techniques

   - Support uid/gid options when mounting bpffs

   - Add a new kfunc which acquires the associated cgroup of a task
     within a specific cgroup v1 hierarchy where the latter is
     identified by its id

   - Extend verifier to allow bpf_refcount_acquire() of a map value
     field obtained via direct load which is a use-case needed in
     sched_ext

   - Add BPF link_info support for uprobe multi link along with bpftool
     integration for the latter

   - Support for VLAN tag in XDP hints

   - Remove deprecated bpfilter kernel leftovers given the project is
     developed in user-space (https://github.com/facebook/bpfilter)

  Misc:

   - Support for parellel TC self-tests execution

   - Increase MPTCP self-tests coverage

   - Updated the bridge documentation, including several so-far
     undocumented features

   - Convert all the net self-tests to run in unique netns, to avoid
     random failures due to conflict and allow concurrent runs

   - Add TCP-AO self-tests

   - Add kunit tests for both cfg80211 and mac80211

   - Autogenerate Netlink families documentation from YAML spec

   - Add yml-gen support for fixed headers and recursive nests, the tool
     can now generate user-space code for all genetlink families for
     which we have specs

   - A bunch of additional module descriptions fixes

   - Catch incorrect freeing of pages belonging to a page pool

  Driver API:

   - Rust abstractions for network PHY drivers; do not cover yet the
     full C API, but already allow implementing functional PHY drivers
     in rust

   - Introduce queue and NAPI support in the netdev Netlink interface,
     allowing complete access to the device <> NAPIs <> queues
     relationship

   - Introduce notifications filtering for devlink to allow control
     application scale to thousands of instances

   - Improve PHY validation, requesting rate matching information for
     each ethtool link mode supported by both the PHY and host

   - Add support for ethtool symmetric-xor RSS hash

   - ACPI based Wifi band RFI (WBRF) mitigation feature for the AMD
     platform

   - Expose pin fractional frequency offset value over new DPLL generic
     netlink attribute

   - Convert older drivers to platform remove callback returning void

   - Add support for PHY package MMD read/write

  New hardware / drivers:

   - Ethernet:
       - Octeon CN10K devices
       - Broadcom 5760X P7
       - Qualcomm SM8550 SoC
       - Texas Instrument DP83TG720S PHY

   - Bluetooth:
       - IMC Networks Bluetooth radio

  Removed:

   - WiFi:
       - libertas 16-bit PCMCIA support
       - Atmel at76c50x drivers
       - HostAP ISA/PCMCIA style 802.11b driver
       - zd1201 802.11b USB dongles
       - Orinoco ISA/PCMCIA 802.11b driver
       - Aviator/Raytheon driver
       - Planet WL3501 driver
       - RNDIS USB 802.11b driver

  Driver updates:

   - Ethernet high-speed NICs:
       - Intel (100G, ice, idpf):
          - allow one by one port representors creation and removal
          - add temperature and clock information reporting
          - add get/set for ethtool's header split ringparam
          - add again FW logging
          - adds support switchdev hardware packet mirroring
          - iavf: implement symmetric-xor RSS hash
          - igc: add support for concurrent physical and free-running
            timers
          - i40e: increase the allowable descriptors
       - nVidia/Mellanox:
          - Preparation for Socket-Direct multi-dev netdev. That will
            allow in future releases combining multiple PFs devices
            attached to different NUMA nodes under the same netdev
       - Broadcom (bnxt):
          - TX completion handling improvements
          - add basic ntuple filter support
          - reduce MSIX vectors usage for MQPRIO offload
          - add VXLAN support, USO offload and TX coalesce completion
            for P7
       - Marvell Octeon EP:
          - xmit-more support
          - add PF-VF mailbox support and use it for FW notifications
            for VFs
       - Wangxun (ngbe/txgbe):
          - implement ethtool functions to operate pause param, ring
            param, coalesce channel number and msglevel
       - Netronome/Corigine (nfp):
          - add flow-steering support
          - support UDP segmentation offload

   - Ethernet NICs embedded, slower, virtual:
       - Xilinx AXI: remove duplicate DMA code adopting the dma engine
         driver
       - stmmac: add support for HW-accelerated VLAN stripping
       - TI AM654x sw: add mqprio, frame preemption & coalescing
       - gve: add support for non-4k page sizes.
       - virtio-net: support dynamic coalescing moderation

   - nVidia/Mellanox Ethernet datacenter switches:
       - allow firmware upgrade without a reboot
       - more flexible support for bridge flooding via the compressed
         FID flooding mode

   - Ethernet embedded switches:
       - Microchip:
          - fine-tune flow control and speed configurations in KSZ8xxx
          - KSZ88X3: enable setting rmii reference
       - Renesas:
          - add jumbo frames support
       - Marvell:
          - 88E6xxx: add "eth-mac" and "rmon" stats support

   - Ethernet PHYs:
       - aquantia: add firmware load support
       - at803x: refactor the driver to simplify adding support for more
         chip variants
       - NXP C45 TJA11xx: Add MACsec offload support

   - Wifi:
       - MediaTek (mt76):
          - NVMEM EEPROM improvements
          - mt7996 Extremely High Throughput (EHT) improvements
          - mt7996 Wireless Ethernet Dispatcher (WED) support
          - mt7996 36-bit DMA support
       - Qualcomm (ath12k):
          - support for a single MSI vector
          - WCN7850: support AP mode
       - Intel (iwlwifi):
          - new debugfs file fw_dbg_clear
          - allow concurrent P2P operation on DFS channels

   - Bluetooth:
       - QCA2066: support HFP offload
       - ISO: more broadcast-related improvements
       - NXP: better recovery in case receiver/transmitter get out of sync"

* tag 'net-next-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1714 commits)
  lan78xx: remove redundant statement in lan78xx_get_eee
  lan743x: remove redundant statement in lan743x_ethtool_get_eee
  bnxt_en: Fix RCU locking for ntuple filters in bnxt_rx_flow_steer()
  bnxt_en: Fix RCU locking for ntuple filters in bnxt_srxclsrldel()
  bnxt_en: Remove unneeded variable in bnxt_hwrm_clear_vnic_filter()
  tcp: Revert no longer abort SYN_SENT when receiving some ICMP
  Revert "mlx5 updates 2023-12-20"
  Revert "net: stmmac: Enable Per DMA Channel interrupt"
  ipvlan: Remove usage of the deprecated ida_simple_xx() API
  ipvlan: Fix a typo in a comment
  net/sched: Remove ipt action tests
  net: stmmac: Use interrupt mode INTM=1 for per channel irq
  net: stmmac: Add support for TX/RX channel interrupt
  net: stmmac: Make MSI interrupt routine generic
  dt-bindings: net: snps,dwmac: per channel irq
  net: phy: at803x: make read_status more generic
  net: phy: at803x: add support for cdt cross short test for qca808x
  net: phy: at803x: refactor qca808x cable test get status function
  net: phy: at803x: generalize cdt fault length function
  net: ethernet: cortina: Drop TSO support
  ...
2024-01-11 10:07:29 -08:00
Linus Torvalds
de927f6c0b s390 updates for 6.8 merge window
- Add machine variable capacity information to /proc/sysinfo.
 
 - Limit the waste of page tables and always align vmalloc area size
   and base address on segment boundary.
 
 - Fix a memory leak when an attempt to register interruption sub class
   (ISC) for the adjunct-processor (AP) guest failed.
 
 - Reset response code AP_RESPONSE_INVALID_GISA to understandable
   by guest AP_RESPONSE_INVALID_ADDRESS in response to a failed
   interruption sub class (ISC) registration attempt.
 
 - Improve reaction to adjunct-processor (AP) AP_RESPONSE_OTHERWISE_CHANGED
   response code when enabling interrupts on behalf of a guest.
 
 - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP) queue
   device bound to the vfio_ap device driver when the mediated device is
   attached to a guest, but the queue device is not passed through.
 
 - Rework struct ap_card to hold the whole adjunct-processor (AP) card
   hardware information. As result, all the ugly bit checks are replaced
   by simple evaluations of the required bit fields.
 
 - Improve handling of some weird scenarios between service element (SE)
   host and SE guest with adjunct-processor (AP) pass-through support.
 
 - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return the
   previous value of the to be changed control register. This is useful if
   a bit is only changed temporarily and the previous content needs to be
   restored.
 
 - The kernel starts with machine checks disabled and is expected to enable
   it once trap_init() is called. However the implementation allows machine
   checks early. Consistently enable it in trap_init() only.
 
 - local_mcck_disable() and local_mcck_enable() assume that machine checks
   are always enabled. Instead implement and use local_mcck_save() and
   local_mcck_restore() to disable machine checks and restore the previous
   state.
 
 - Modification of floating point control (FPC) register of a traced
   process using ptrace interface may lead to corruption of the FPC
   register of the tracing process. Fix this.
 
 - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point control
   (FPC) register in vCPU, but may lead to corruption of the FPC register
   of the host process. Fix this.
 
 - Use READ_ONCE() to read a vCPU floating point register value from the
   memory mapped area. This avoids that, depending on code generation,
   a different value is tested for validity than the one that is used.
 
 - Get rid of test_fp_ctl(), since it is quite subtle to use it correctly.
   Instead copy a new floating point control register value into its save
   area and test the validity of the new value when loading it.
 
 - Remove superfluous save_fpu_regs() call.
 
 - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines
   provide the vector facility since many years and the need to make the
   task structure size dependent on the vector facility does not exist.
 
 - Remove the "novx" kernel command line option, as the vector code runs
   without any problems since many years.
 
 - Add the vector facility to the z13 architecture level set (ALS).
   All hypervisors support the vector facility since many years.
   This allows compile time optimizations of the kernel.
 
 - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As result,
   the compiled code will have less runtime checks and less code.
 
 - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C.
 
 - Convert the struct subchannel spinlock from pointer to member.
 -----BEGIN PGP SIGNATURE-----
 
 iI0EABYIADUWIQQrtrZiYVkVzKQcYivNdxKlNrRb8AUCZZxnchccYWdvcmRlZXZA
 bGludXguaWJtLmNvbQAKCRDNdxKlNrRb8PyGAP9Z0Z1Pm3hPf24M4FgY5uuRqBHo
 VoiuohOZRGONwifnsgEAmN/3ba/7d/j0iMwpUdUFouiqtwTUihh15wYx1sH2IA8=
 =F6YD
 -----END PGP SIGNATURE-----

Merge tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Alexander Gordeev:

 - Add machine variable capacity information to /proc/sysinfo.

 - Limit the waste of page tables and always align vmalloc area size and
   base address on segment boundary.

 - Fix a memory leak when an attempt to register interruption sub class
   (ISC) for the adjunct-processor (AP) guest failed.

 - Reset response code AP_RESPONSE_INVALID_GISA to understandable by
   guest AP_RESPONSE_INVALID_ADDRESS in response to a failed
   interruption sub class (ISC) registration attempt.

 - Improve reaction to adjunct-processor (AP)
   AP_RESPONSE_OTHERWISE_CHANGED response code when enabling interrupts
   on behalf of a guest.

 - Fix incorrect sysfs 'status' attribute of adjunct-processor (AP)
   queue device bound to the vfio_ap device driver when the mediated
   device is attached to a guest, but the queue device is not passed
   through.

 - Rework struct ap_card to hold the whole adjunct-processor (AP) card
   hardware information. As result, all the ugly bit checks are replaced
   by simple evaluations of the required bit fields.

 - Improve handling of some weird scenarios between service element (SE)
   host and SE guest with adjunct-processor (AP) pass-through support.

 - Change local_ctl_set_bit() and local_ctl_clear_bit() so they return
   the previous value of the to be changed control register. This is
   useful if a bit is only changed temporarily and the previous content
   needs to be restored.

 - The kernel starts with machine checks disabled and is expected to
   enable it once trap_init() is called. However the implementation
   allows machine checks early. Consistently enable it in trap_init()
   only.

 - local_mcck_disable() and local_mcck_enable() assume that machine
   checks are always enabled. Instead implement and use
   local_mcck_save() and local_mcck_restore() to disable machine checks
   and restore the previous state.

 - Modification of floating point control (FPC) register of a traced
   process using ptrace interface may lead to corruption of the FPC
   register of the tracing process. Fix this.

 - kvm_arch_vcpu_ioctl_set_fpu() allows to set the floating point
   control (FPC) register in vCPU, but may lead to corruption of the FPC
   register of the host process. Fix this.

 - Use READ_ONCE() to read a vCPU floating point register value from the
   memory mapped area. This avoids that, depending on code generation, a
   different value is tested for validity than the one that is used.

 - Get rid of test_fp_ctl(), since it is quite subtle to use it
   correctly. Instead copy a new floating point control register value
   into its save area and test the validity of the new value when
   loading it.

 - Remove superfluous save_fpu_regs() call.

 - Remove s390 support for ARCH_WANTS_DYNAMIC_TASK_STRUCT. All machines
   provide the vector facility since many years and the need to make the
   task structure size dependent on the vector facility does not exist.

 - Remove the "novx" kernel command line option, as the vector code runs
   without any problems since many years.

 - Add the vector facility to the z13 architecture level set (ALS). All
   hypervisors support the vector facility since many years. This allows
   compile time optimizations of the kernel.

 - Get rid of MACHINE_HAS_VX and replace it with cpu_has_vx(). As
   result, the compiled code will have less runtime checks and less
   code.

 - Convert pgste_get_lock() and pgste_set_unlock() ASM inlines to C.

 - Convert the struct subchannel spinlock from pointer to member.

* tag 's390-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (24 commits)
  Revert "s390: update defconfigs"
  s390/cio: make sch->lock spinlock pointer a member
  s390: update defconfigs
  s390/mm: convert pgste locking functions to C
  s390/fpu: get rid of MACHINE_HAS_VX
  s390/als: add vector facility to z13 architecture level set
  s390/fpu: remove "novx" option
  s390/fpu: remove ARCH_WANTS_DYNAMIC_TASK_STRUCT support
  KVM: s390: remove superfluous save_fpu_regs() call
  s390/fpu: get rid of test_fp_ctl()
  KVM: s390: use READ_ONCE() to read fpc register value
  KVM: s390: fix setting of fpc register
  s390/ptrace: handle setting of fpc register correctly
  s390/nmi: implement and use local_mcck_save() / local_mcck_restore()
  s390/nmi: consistently enable machine checks in trap_init()
  s390/ctlreg: return old register contents when changing bits
  s390/ap: handle outband SE bind state change
  s390/ap: store TAPQ hwinfo in struct ap_card
  s390/vfio-ap: fix sysfs status attribute for AP queue devices
  s390/vfio-ap: improve reaction to response code 07 from PQAP(AQIC) command
  ...
2024-01-10 18:18:20 -08:00
Linus Torvalds
c299010061 asm-generic cleanups for 6.8
A series from Baoquan He cleans up the asm-generic/io.h to remove the
 ioremap_uc() definition from everything except x86, which still needs it
 for pre-PAT systems. This series notably contains a patch from Jiaxun Yang
 that converts MIPS to use asm-generic/io.h like every other architecture
 does, enabling future cleanups.
 
 Some of my own patches fix -Wmissing-prototype warnings in architecture
 specific code across several architectures. This is now needed as the
 warning is enabled by default. There are still some remaining warnings
 in minor platforms, but the series should catch most of the widely used
 ones make them more consistent with one another.
 
 David McKay fixes a bug in __generic_cmpxchg_local() when this is used
 on 64-bit architectures. This could currently only affect parisc64
 and sparc64.
 
 Additional cleanups address from Linus Walleij, Uwe Kleine-König,
 Thomas Huth, and Kefeng Wang help reduce unnecessary inconsistencies
 between architectures.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmWeak8ACgkQYKtH/8kJ
 UidSiQ/+LL1WTO9d3Zx5HI0GGGjaIYpYs6jUNSf9Y5GPQiOrvjfEWj7CU11/4vxl
 GlQRpRyncYm8Eiz0Qu+aNxZFiiMah8Uful75yfbX8P1L4EPTbAYNDjkyNJrTjIAK
 jPK4sl8awIrapOeFUz++PsEj22R/4Is4f0mo+CqoCkL5RKlHe5oFdXzcwjmds4yK
 CvU6Ldn+M7FZ3EItMdjXaB3D3HS9uictFiO5JByZY8p+IcqgNRI/iHNnZIMsltJ+
 XjDi0DG+x4jCj6teElSchw7AofE4OcNSP3xbR1PLKv6+xBLGYaAGZhNuPTz88eV/
 Gj0loDQrrR5McGUfDBRHK9zN2Jd0O/FKnfh9kLOt1FLFyGPvC78Q/2HkpVCjbBr2
 Pr1aqhLDHA+tGNSsThsV8RUa8/tiEnxAki43tfBFS3SEKhtQsTm2g1z4miwbE3p0
 BJIrSgTqrP/SBq7a9z/thPrkzdZcNuA9FUETTbaMeUlJS51n1V9E5A1t7sOG7jaI
 vV/gbuR6FjvD49mTyQiOSCt3V4ygRqgN1Q+C4QM8WLqq2keUq0AhGodquv8F78in
 J3x2j2r27lHY7jKf8B0dua/JXAsF20u8qD6yDQ9ymkjt/MWhGXBgK0jpT7RTIuMS
 e2jmTywUVD4UohAcx3inkOojUhIJ5KDB0I4Pzv4zWcHNbyFNKcY=
 =4VQl
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic cleanups from Arnd Bergmann:
 "A series from Baoquan He cleans up the asm-generic/io.h to remove the
  ioremap_uc() definition from everything except x86, which still needs
  it for pre-PAT systems. This series notably contains a patch from
  Jiaxun Yang that converts MIPS to use asm-generic/io.h like every
  other architecture does, enabling future cleanups.

  Some of my own patches fix -Wmissing-prototype warnings in
  architecture specific code across several architectures. This is now
  needed as the warning is enabled by default. There are still some
  remaining warnings in minor platforms, but the series should catch
  most of the widely used ones make them more consistent with one
  another.

  David McKay fixes a bug in __generic_cmpxchg_local() when this is used
  on 64-bit architectures. This could currently only affect parisc64 and
  sparc64.

  Additional cleanups address from Linus Walleij, Uwe Kleine-König,
  Thomas Huth, and Kefeng Wang help reduce unnecessary inconsistencies
  between architectures"

* tag 'asm-generic-6.8' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  asm-generic: Fix 32 bit __generic_cmpxchg_local
  Hexagon: Make pfn accessors statics inlines
  ARC: mm: Make virt_to_pfn() a static inline
  mips: remove extraneous asm-generic/iomap.h include
  sparc: Use $(kecho) to announce kernel images being ready
  arm64: vdso32: Define BUILD_VDSO32_64 to correct prototypes
  csky: fix arch_jump_label_transform_static override
  arch: add do_page_fault prototypes
  arch: add missing prepare_ftrace_return() prototypes
  arch: vdso: consolidate gettime prototypes
  arch: include linux/cpu.h for trap_init() prototype
  arch: fix asm-offsets.c building with -Wmissing-prototypes
  arch: consolidate arch_irq_work_raise prototypes
  hexagon: Remove CONFIG_HEXAGON_ARCH_VERSION from uapi header
  asm/io: remove unnecessary xlate_dev_mem_ptr() and unxlate_dev_mem_ptr()
  mips: io: remove duplicated codes
  arch/*/io.h: remove ioremap_uc in some architectures
  mips: add <asm-generic/io.h> including
2024-01-10 18:13:44 -08:00
Linus Torvalds
4cd083d531 Modules changes for v6.8-rc1
Just one cleanup and one documentation improvement change. No functional
 changes. However, this has been tested on linux-next for over 1 month.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmWdW0USHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinL5gQAMUtU4nv+HahLwJK3ha5fRmcHm2kV9Q4
 g9X7JUscEe+mNYLNy2Kjpl/BWatEHWn8jQD2nsMQJOcEkX2Mf5tfqBR3561wrfSZ
 dnMmNDG7Ym6Y9kOSDz6cpCxsu8Xm5Dj9MLKJ51qPfyXgeobD8IiKBe2oCDqcKzGY
 ZpDnmpUaYCOIloNhJNK9ybrNLsVDQwdPiC8vVQXULC4ePBw3i+mnh8c1wr442wEO
 G6xfi0wNXIoB9S8ynzakW2lJPD1XeMQYu/SJR0nz61KhJxAjs/LVAt9k7itcnUgG
 Zbc1Fn944oWoe7/ywwDIxstR56NYVpcXoTJexeHxrLe6PiEmRbh6phrBMUCOpUsh
 0DrHJE8z4dsHovo6w6m1zvMF6FphLHhUU6L/opBwrUJ5CGrYfegG95v8d3jRSphe
 GSoMo9iGHvr0PgY7OcG77m5NJjrFwwbPO88Fe3IAXmOXIrKYuzBoUnt3Te1dw8vX
 6vYPUUZ3HLDOACWtKf2Tjhr7pM6b1C72vrg7uYc2560BH6ERqjAtuOWKrx8QOeeo
 SUT4ACs4qa3DrPz7zpNhwwnfwpFoVuJd+fooMV/WhOU9KsCVCokC7zZub+J8UD7m
 1j1nhWkKaF08c0BKW4zRMmUVSV6Dh/AO0YybetBq9b4o7NPEcij2HlIk+jJa0VMJ
 XSx5UcuxRyf7
 =hvRa
 -----END PGP SIGNATURE-----

Merge tag 'modules-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull module updates from Luis Chamberlain:
 "Just one cleanup and one documentation improvement change. No
  functional changes"

* tag 'modules-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  kernel/module: improve documentation for try_module_get()
  module: Remove redundant TASK_UNINTERRUPTIBLE
2024-01-10 18:00:18 -08:00
Linus Torvalds
a05aea98d4 sysctl-6.8-rc1
To help make the move of sysctls out of kernel/sysctl.c not incur a size
 penalty sysctl has been changed to allow us to not require the sentinel, the
 final empty element on the sysctl array. Joel Granados has been doing all this
 work. On the v6.6 kernel we got the major infrastructure changes required to
 support this. For v6.7 we had all arch/ and drivers/ modified to remove
 the sentinel. For v6.8-rc1 we get a few more updates for fs/ directory only.
 The kernel/ directory is left but we'll save that for v6.9-rc1 as those patches
 are still being reviewed. After that we then can expect also the removal of the
 no longer needed check for procname == NULL.
 
 Let us recap the purpose of this work:
 
   - this helps reduce the overall build time size of the kernel and run time
     memory consumed by the kernel by about ~64 bytes per array
   - the extra 64-byte penalty is no longer inncurred now when we move sysctls
     out from kernel/sysctl.c to their own files
 
 Thomas Weißschuh also sent a few cleanups, for v6.9-rc1 we expect to see further
 work by Thomas Weißschuh with the constificatin of the struct ctl_table.
 
 Due to Joel Granados's work, and to help bring in new blood, I have suggested
 for him to become a maintainer and he's accepted. So for v6.9-rc1 I look forward
 to seeing him sent you a pull request for further sysctl changes. This also
 removes Iurii Zaikin as a maintainer as he has moved on to other projects and
 has had no time to help at all.
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCgAwFiEENnNq2KuOejlQLZofziMdCjCSiKcFAmWdWDESHG1jZ3JvZkBr
 ZXJuZWwub3JnAAoJEM4jHQowkoinjJAP/jTNNoyzWisvrrvmXqR5txFGLOE+wW6x
 Xv9avuiM+DTHsH/wK8CkXEivwDqYNAZEHU7NEcolS5bJX/ddSRwN9b5aSVlCrUdX
 Ab4rXmpeSCNFp9zNszWJsDuBKIqjvsKw7qGleGtgZ2qAUHbbH30VROLWCggaee50
 wU3icDLdwkasxrcMXy4Sq5dT5wYC4j/QelqBGIkYPT14Arl1im5zqPZ95gmO/s/6
 mdicTAmq+hhAUfUBJBXRKtsvxY6CItxe55Q4fjpncLUJLHUw+VPVNoBKFWJlBwlh
 LO3liKFfakPSkil4/en+/+zuMByd0JBkIzIJa+Kk5kjpbHRhK0RkmU4+Y5G5spWN
 jjLfiv6RxInNaZ8EWQBMfjE95A7PmYDQ4TOH08+OvzdDIi6B0BB5tBGQpG9BnyXk
 YsLg1Uo4CwE/vn1/a9w0rhadjUInvmAryhb/uSJYFz/lmApLm2JUpY3/KstwGetb
 z+HmLstJb24Djkr6pH8DcjhzRBHeWQ5p0b4/6B+v1HqAUuEhdbyw1F2GrDywyF3R
 h/UOAaKLm1+ffdA246o9TejKiDU96qEzzXMaCzPKyestaRZuiyuYEMDhYbvtsMV5
 zIdMJj5HQ+U1KHDv4IN99DEj7+/vjE3f4Sjo+POFpQeQ8/d+fxpFNqXVv449dgnb
 6xEkkxsR0ElM
 =2qBt
 -----END PGP SIGNATURE-----

Merge tag 'sysctl-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux

Pull sysctl updates from Luis Chamberlain:
 "To help make the move of sysctls out of kernel/sysctl.c not incur a
  size penalty sysctl has been changed to allow us to not require the
  sentinel, the final empty element on the sysctl array. Joel Granados
  has been doing all this work.

  In the v6.6 kernel we got the major infrastructure changes required to
  support this. For v6.7 we had all arch/ and drivers/ modified to
  remove the sentinel. For v6.8-rc1 we get a few more updates for fs/
  directory only.

  The kernel/ directory is left but we'll save that for v6.9-rc1 as
  those patches are still being reviewed. After that we then can expect
  also the removal of the no longer needed check for procname == NULL.

  Let us recap the purpose of this work:

   - this helps reduce the overall build time size of the kernel and run
     time memory consumed by the kernel by about ~64 bytes per array

   - the extra 64-byte penalty is no longer inncurred now when we move
     sysctls out from kernel/sysctl.c to their own files

  Thomas Weißschuh also sent a few cleanups, for v6.9-rc1 we expect to
  see further work by Thomas Weißschuh with the constificatin of the
  struct ctl_table.

  Due to Joel Granados's work, and to help bring in new blood, I have
  suggested for him to become a maintainer and he's accepted. So for
  v6.9-rc1 I look forward to seeing him sent you a pull request for
  further sysctl changes. This also removes Iurii Zaikin as a maintainer
  as he has moved on to other projects and has had no time to help at
  all"

* tag 'sysctl-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux:
  sysctl: remove struct ctl_path
  sysctl: delete unused define SYSCTL_PERM_EMPTY_DIR
  coda: Remove the now superfluous sentinel elements from ctl_table array
  sysctl: Remove the now superfluous sentinel elements from ctl_table array
  fs: Remove the now superfluous sentinel elements from ctl_table array
  cachefiles: Remove the now superfluous sentinel element from ctl_table array
  sysclt: Clarify the results of selftest run
  sysctl: Add a selftest for handling empty dirs
  sysctl: Fix out of bounds access for empty sysctl registers
  MAINTAINERS: Add Joel Granados as co-maintainer for proc sysctl
  MAINTAINERS: remove Iurii Zaikin from proc sysctl
2024-01-10 17:44:36 -08:00
Linus Torvalds
78273df7f6 header cleanups for 6.8
The goal is to get sched.h down to a type only header, so the main thing
 happening in this patchset is splitting out various _types.h headers and
 dependency fixups, as well as moving some things out of sched.h to
 better locations.
 
 This is prep work for the memory allocation profiling patchset which
 adds new sched.h interdepencencies.
 
 Testing - it's been in -next, and fixes from pretty much all
 architectures have percolated in - nothing major.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmWfBwwACgkQE6szbY3K
 bnZPwBAAmuRojXaeWxi01IPIOehSGDe68vw44PR9glEMZvxdnZuPOdvE4/+245/L
 bRKU2WBCjBUokUbV9msIShwRkFTZAmEMPNfPAAsFMA+VXeDYHKB+ZRdwTggNAQ+I
 SG6fZgh5m0HsewCDxU8oqVHkjVq4fXn0cy+aL6xLEd9gu67GoBzX2pDieS2Kvy6j
 jnyoKTxFwb+LTQgph0P4EIpq5I2umAsdLwdSR8EJ+8e9NiNvMo1pI00Lx/ntAnFZ
 JftWUJcMy3TQ5u1GkyfQN9y/yThX1bZK5GvmHS9SJ2Dkacaus5d+xaKCHtRuFS1I
 7C6b8PsNgRczUMumBXus44HdlNfNs1yU3lvVxFvBIPE1qC9pYRHrkWIXXIocXLLC
 oxTEJ6B2G3BQZVQgLIA4fOaxMVhmvKffi/aEZLi9vN9VVosd1a6XNKI6KbyRnXFp
 GSs9qDqszhn5I3GYNlDNQTc/8UsRlhPFgS6nS0By6QnvxtGi9QkU2tBRBsXvqwCy
 cLoCYIhc2tvugHvld70dz26umiJ4rnmxGlobStNoigDvIKAIUt1UmIdr1so8P8eH
 xehnL9ZcOX6xnANDL0AqMFFHV6I58CJynhFdUoXfVQf/DWLGX48mpi9LVNsYBzsI
 CAwVOAQ0UjGrpdWmJ9ueY/ABYqg9vRjzaDEXQ+MhAYO55CLaVsg=
 =3tyT
 -----END PGP SIGNATURE-----

Merge tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs

Pull header cleanups from Kent Overstreet:
 "The goal is to get sched.h down to a type only header, so the main
  thing happening in this patchset is splitting out various _types.h
  headers and dependency fixups, as well as moving some things out of
  sched.h to better locations.

  This is prep work for the memory allocation profiling patchset which
  adds new sched.h interdepencencies"

* tag 'header_cleanup-2024-01-10' of https://evilpiepirate.org/git/bcachefs: (51 commits)
  Kill sched.h dependency on rcupdate.h
  kill unnecessary thread_info.h include
  Kill unnecessary kernel.h include
  preempt.h: Kill dependency on list.h
  rseq: Split out rseq.h from sched.h
  LoongArch: signal.c: add header file to fix build error
  restart_block: Trim includes
  lockdep: move held_lock to lockdep_types.h
  sem: Split out sem_types.h
  uidgid: Split out uidgid_types.h
  seccomp: Split out seccomp_types.h
  refcount: Split out refcount_types.h
  uapi/linux/resource.h: fix include
  x86/signal: kill dependency on time.h
  syscall_user_dispatch.h: split out *_types.h
  mm_types_task.h: Trim dependencies
  Split out irqflags_types.h
  ipc: Kill bogus dependency on spinlock.h
  shm: Slim down dependencies
  workqueue: Split out workqueue_types.h
  ...
2024-01-10 16:43:55 -08:00
Linus Torvalds
999a36b52b bcachefs updates for 6.8:
- btree write buffer rewrite: instead of adding keys to the btree write
    buffer at transaction commit time, we know journal them with a
    different journal entry type and copy them from the journal to the
    write buffer just prior to journal write.
 
    This reduces the number of atomic operations on shared cachelines
    in the transaction commit path and is a signicant performance
    improvement on some workloads: multithreaded 4k random writes went
    from ~650k iops to ~850k iops.
 
  - Bring back optimistic spinning for six locks: the new implementation
    doesn't use osq locks; instead we add to the lock waitlist as normal,
    and then spin on the lock_acquired bit in the waitlist entry, _not_
    the lock itself.
 
  - BCH_IOCTL_DEV_USAGE_V2, which allows for new data types
  - BCH_IOCTL_OFFLINE_FSCK, which runs the kernel implementation of fsck
    but without mounting: useful for transparently using the kernel
    version of fsck from 'bcachefs fsck' when the kernel version is a
    better match for the on disk filesystem.
 
  - BCH_IOCTL_ONLINE_FSCK: online fsck. Not all passes are supported yet,
    but the passes that are supported are fully featured - errors may be
    corrected as normal.
 
    The new ioctls use the new 'thread_with_file' abstraction for kicking
    off a kthread that's tied to a file descriptor returned to userspace
    via the ioctl.
 
  - btree_paths within a btree_trans are now dynamically growable,
    instead of being limited to 64. This is important for the
    check_directory_structure phase of fsck, and also fixes some issues
    we were having with btree path overflow in the reflink btree.
 
  - Trigger refactoring; prep work for the upcoming disk space accounting
    rewrite
 
  - Numerous bugfixes :)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKnAFLkS8Qha+jvQrE6szbY3KbnYFAmWe8PUACgkQE6szbY3K
 bnYw6g/9GAXfIGasTZZwK2XEr36RYtEFYMwd/m9V1ET0DH6d/MFH9G7tTYl52AQ4
 k9cDFb0d2qdtNk2Rlml1lHFrxMzkp2Q7j9S4YcETrE+/Dir8ODVcJXrGeNTCMGmz
 B+C12mTOpWrzGMrioRgFZjWAnacsY3RP8NFRTT9HIJHO9UCP+xN5y++sX10C5Gwv
 7UVWTaUwjkgdYWkR8RCKGXuG5cNNlRp4Y0eeK2XruG1iI9VAilir1glcD/YMOY8M
 vECQzmf2ZLGFS/tpnmqVhNbNwVWpTQMYassvKaisWNHLDUgskOoF8YfoYSH27t7F
 GBb1154O2ga6ea866677FDeNVlg386mGCTUy2xOhMpDL3zW+/Is+8MdfJI4MJP5R
 EwcjHnn2bk0C2kULbAohw0gnU42FulfvsLNnrfxCeygmZrDoOOCL1HpvnBG4vskc
 Fp6NK83l974QnyLdPsjr1yB2d2pgb+uMP1v76IukQi0IjNSAyvwSa5nloPTHRzpC
 j6e2cFpdtX+6vEu6KngXVKTblSEnwhVBTaTR37Lr8PX1sZqFS/+mjRDgg3HZa/GI
 u0fC0mQyVL9KjDs5LJGpTc/qs8J4mpoS5+dfzn38MI76dFxd5TYZKWVfILTrOtDF
 ugDnoLkMuYFdueKI2M3YzxXyaA7HBT+7McAdENuJJzJnEuSAZs0=
 =JvA2
 -----END PGP SIGNATURE-----

Merge tag 'bcachefs-2024-01-10' of https://evilpiepirate.org/git/bcachefs

Pull bcachefs updates from Kent Overstreet:

 - btree write buffer rewrite: instead of adding keys to the btree write
   buffer at transaction commit time, we now journal them with a
   different journal entry type and copy them from the journal to the
   write buffer just prior to journal write.

   This reduces the number of atomic operations on shared cachelines in
   the transaction commit path and is a signicant performance
   improvement on some workloads: multithreaded 4k random writes went
   from ~650k iops to ~850k iops.

 - Bring back optimistic spinning for six locks: the new implementation
   doesn't use osq locks; instead we add to the lock waitlist as normal,
   and then spin on the lock_acquired bit in the waitlist entry, _not_
   the lock itself.

 - New ioctls:

    - BCH_IOCTL_DEV_USAGE_V2, which allows for new data types

    - BCH_IOCTL_OFFLINE_FSCK, which runs the kernel implementation of
      fsck but without mounting: useful for transparently using the
      kernel version of fsck from 'bcachefs fsck' when the kernel
      version is a better match for the on disk filesystem.

    - BCH_IOCTL_ONLINE_FSCK: online fsck. Not all passes are supported
      yet, but the passes that are supported are fully featured - errors
      may be corrected as normal.

   The new ioctls use the new 'thread_with_file' abstraction for kicking
   off a kthread that's tied to a file descriptor returned to userspace
   via the ioctl.

 - btree_paths within a btree_trans are now dynamically growable,
   instead of being limited to 64. This is important for the
   check_directory_structure phase of fsck, and also fixes some issues
   we were having with btree path overflow in the reflink btree.

 - Trigger refactoring; prep work for the upcoming disk space accounting
   rewrite

 - Numerous bugfixes :)

* tag 'bcachefs-2024-01-10' of https://evilpiepirate.org/git/bcachefs: (226 commits)
  bcachefs: eytzinger0_find() search should be const
  bcachefs: move "ptrs not changing" optimization to bch2_trigger_extent()
  bcachefs: fix simulateously upgrading & downgrading
  bcachefs: Restart recovery passes more reliably
  bcachefs: bch2_dump_bset() doesn't choke on u64s == 0
  bcachefs: improve checksum error messages
  bcachefs: improve validate_bset_keys()
  bcachefs: print sb magic when relevant
  bcachefs: __bch2_sb_field_to_text()
  bcachefs: %pg is banished
  bcachefs: Improve would_deadlock trace event
  bcachefs: fsck_err()s don't need to manually check c->sb.version anymore
  bcachefs: Upgrades now specify errors to fix, like downgrades
  bcachefs: no thread_with_file in userspace
  bcachefs: Don't autofix errors we can't fix
  bcachefs: add missing bch2_latency_acct() call
  bcachefs: increase max_active on io_complete_wq
  bcachefs: add time_stats for btree_node_read_done()
  bcachefs: don't clear accessed bit in btree node fill
  bcachefs: Add an option to control btree node prefetching
  ...
2024-01-10 16:34:17 -08:00