The MII code does not check the return value of mdio_read (among
others), and therefore no error code should be sent. A previous fix to
the use of an uninitialized variable propagates negative error codes,
that might lead to wrong operations by the MII library.
An example of such issues is the use of mii_nway_restart by the dm9601
driver. The mii_nway_restart function does not check the value returned
by mdio_read, which in this case might be a negative number which could
contain the exact bit the function checks (BMCR_ANENABLE = 0x1000).
Return zero in case of error, as it is common practice in users of
mdio_read to avoid wrong uses of the return value.
Fixes: 8f8abb863fa5 ("net: usb: dm9601: fix uninitialized variable use in dm9601_mdio_read")
Signed-off-by: Javier Carrasco <javier.carrasco.cruz@gmail.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Peter Korsgaard <peter@korsgaard.com>
Link: https://lore.kernel.org/r/20240225-dm9601_ret_err-v1-1-02c1d959ea59@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
struct veth_rq is pretty large, 832B total without debug
options enabled. Since commit under Fixes we try to pre-allocate
enough queues for every possible CPU. Miao Wang reports that
this may lead to order-5 allocations which will fail in production.
Let the allocation fallback to vmalloc() and try harder.
These are the same flags we pass to netdev queue allocation.
Reported-and-tested-by: Miao Wang <shankerwangmiao@gmail.com>
Fixes: 9d3684c24a52 ("veth: create by default nr_possible_cpus queues")
Link: https://lore.kernel.org/all/5F52CAE2-2FB7-4712-95F1-3312FBBFA8DD@gmail.com/
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240223235908.693010-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
When rebuilding the lif after an FLR, be sure to restore the
current netdev features, not do the usual first time feature
init. This prevents losing user changes to things like TSO
or vlan tagging states.
Fixes: 45b84188a0a4 ("ionic: keep filters across FLR")
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Since we now have potential cases of NULL cmd_regs and info_regs
during a reset recovery, and left NULL if a reset recovery has
failed, we need to check that they exist before we use them.
Most of the cases were covered in the original patch where we
verify before doing the ioreadb() for health or cmd status.
However, we need to protect a few uses of io mem that could
be hit in error recovery or asynchronous threads calls as well
(e.g. ethtool or devlink handlers).
Fixes: 219e183272b4 ("ionic: no fw read when PCI reset failed")
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
AER recovery handler can trigger a PCI Reset after tearing
down the device setup in the error detection handler. The PCI
Reset handler will also attempt to tear down the device setup,
and this second tear down needs to know that it doesn't need
to call pci_release_regions() a second time. We can clear
num_bars on tear down and use that to decide later if we need
to clear the resources. This prevents a harmless but disturbing
warning message
resource: Trying to free nonexistent resource <0xXXXXXXXXXX-0xXXXXXXXXXX>
Fixes: c3a910e1c47a ("ionic: fill out pci error handlers")
Reviewed-by: Brett Creeley <brett.creeley@amd.com>
Signed-off-by: Shannon Nelson <shannon.nelson@amd.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Matthieu Baerts says:
====================
mptcp: more misc. fixes for v6.8
This series includes 6 types of fixes:
- Patch 1 fixes v4 mapped in v6 addresses support for the userspace PM,
when asking to delete a subflow. It was done everywhere else, but not
there. Patch 2 validates the modification, thanks to a subtest in
mptcp_join.sh. These patches can be backported up to v5.19.
- Patch 3 is a small fix for a recent bug-fix patch, just to avoid
printing an irrelevant warning (pr_warn()) once. It can be backported
up to v5.6, alongside the bug-fix that has been introduced in the
v6.8-rc5.
- Patches 4 to 6 are fixes for bugs found by Paolo while working on
TCP_NOTSENT_LOWAT support for MPTCP. These fixes can improve the
performances in some cases. Patches can be backported up to v5.6,
v5.11 and v6.7 respectively.
- Patch 7 makes sure 'ss -M' is available when starting MPTCP Join
selftest as it is required for some subtests since v5.18.
- Patch 8 fixes a possible double-free on socket dismantle. The issue
always existed, but was unnoticed because it was not causing any
problem so far. This fix can be backported up to v5.6.
- Patch 9 is a fix for a very recent patch causing lockdep warnings in
subflow diag. The patch causing the regression -- which fixes another
issue present since v5.7 -- should be part of the future v6.8-rc6.
Patch 10 validates the modification, thanks to a new subtest in
diag.sh.
====================
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-0-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The mptcp diag interface already experienced a few locking bugs
that lockdep and appropriate coverage have detected in advance.
Let's add a test-case triggering the relevant code path, to prevent
similar issues in the future.
Be careful to cope with very slow environments.
Note that we don't need an explicit timeout on the mptcp_connect
subprocess to cope with eventual bug/hang-up as the final cleanup
terminating the child processes will take care of that.
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-10-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
when MPTCP server accepts an incoming connection, it clones its listener
socket. However, the pointer to 'inet_opt' for the new socket has the same
value as the original one: as a consequence, on program exit it's possible
to observe the following splat:
BUG: KASAN: double-free in inet_sock_destruct+0x54f/0x8b0
Free of addr ffff888485950880 by task swapper/25/0
CPU: 25 PID: 0 Comm: swapper/25 Kdump: loaded Not tainted 6.8.0-rc1+ #609
Hardware name: Supermicro SYS-6027R-72RF/X9DRH-7TF/7F/iTF/iF, BIOS 3.0 07/26/2013
Call Trace:
<IRQ>
dump_stack_lvl+0x32/0x50
print_report+0xca/0x620
kasan_report_invalid_free+0x64/0x90
__kasan_slab_free+0x1aa/0x1f0
kfree+0xed/0x2e0
inet_sock_destruct+0x54f/0x8b0
__sk_destruct+0x48/0x5b0
rcu_do_batch+0x34e/0xd90
rcu_core+0x559/0xac0
__do_softirq+0x183/0x5a4
irq_exit_rcu+0x12d/0x170
sysvec_apic_timer_interrupt+0x6b/0x80
</IRQ>
<TASK>
asm_sysvec_apic_timer_interrupt+0x16/0x20
RIP: 0010:cpuidle_enter_state+0x175/0x300
Code: 30 00 0f 84 1f 01 00 00 83 e8 01 83 f8 ff 75 e5 48 83 c4 18 44 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc fb 45 85 ed <0f> 89 60 ff ff ff 48 c1 e5 06 48 c7 43 18 00 00 00 00 48 83 44 2b
RSP: 0018:ffff888481cf7d90 EFLAGS: 00000202
RAX: 0000000000000000 RBX: ffff88887facddc8 RCX: 0000000000000000
RDX: 1ffff1110ff588b1 RSI: 0000000000000019 RDI: ffff88887fac4588
RBP: 0000000000000004 R08: 0000000000000002 R09: 0000000000043080
R10: 0009b02ea273363f R11: ffff88887fabf42b R12: ffffffff932592e0
R13: 0000000000000004 R14: 0000000000000000 R15: 00000022c880ec80
cpuidle_enter+0x4a/0xa0
do_idle+0x310/0x410
cpu_startup_entry+0x51/0x60
start_secondary+0x211/0x270
secondary_startup_64_no_verify+0x184/0x18b
</TASK>
Allocated by task 6853:
kasan_save_stack+0x1c/0x40
kasan_save_track+0x10/0x30
__kasan_kmalloc+0xa6/0xb0
__kmalloc+0x1eb/0x450
cipso_v4_sock_setattr+0x96/0x360
netlbl_sock_setattr+0x132/0x1f0
selinux_netlbl_socket_post_create+0x6c/0x110
selinux_socket_post_create+0x37b/0x7f0
security_socket_post_create+0x63/0xb0
__sock_create+0x305/0x450
__sys_socket_create.part.23+0xbd/0x130
__sys_socket+0x37/0xb0
__x64_sys_socket+0x6f/0xb0
do_syscall_64+0x83/0x160
entry_SYSCALL_64_after_hwframe+0x6e/0x76
Freed by task 6858:
kasan_save_stack+0x1c/0x40
kasan_save_track+0x10/0x30
kasan_save_free_info+0x3b/0x60
__kasan_slab_free+0x12c/0x1f0
kfree+0xed/0x2e0
inet_sock_destruct+0x54f/0x8b0
__sk_destruct+0x48/0x5b0
subflow_ulp_release+0x1f0/0x250
tcp_cleanup_ulp+0x6e/0x110
tcp_v4_destroy_sock+0x5a/0x3a0
inet_csk_destroy_sock+0x135/0x390
tcp_fin+0x416/0x5c0
tcp_data_queue+0x1bc8/0x4310
tcp_rcv_state_process+0x15a3/0x47b0
tcp_v4_do_rcv+0x2c1/0x990
tcp_v4_rcv+0x41fb/0x5ed0
ip_protocol_deliver_rcu+0x6d/0x9f0
ip_local_deliver_finish+0x278/0x360
ip_local_deliver+0x182/0x2c0
ip_rcv+0xb5/0x1c0
__netif_receive_skb_one_core+0x16e/0x1b0
process_backlog+0x1e3/0x650
__napi_poll+0xa6/0x500
net_rx_action+0x740/0xbb0
__do_softirq+0x183/0x5a4
The buggy address belongs to the object at ffff888485950880
which belongs to the cache kmalloc-64 of size 64
The buggy address is located 0 bytes inside of
64-byte region [ffff888485950880, ffff8884859508c0)
The buggy address belongs to the physical page:
page:0000000056d1e95e refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888485950700 pfn:0x485950
flags: 0x57ffffc0000800(slab|node=1|zone=2|lastcpupid=0x1fffff)
page_type: 0xffffffff()
raw: 0057ffffc0000800 ffff88810004c640 ffffea00121b8ac0 dead000000000006
raw: ffff888485950700 0000000000200019 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
Memory state around the buggy address:
ffff888485950780: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff888485950800: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888485950880: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
^
ffff888485950900: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
ffff888485950980: 00 00 00 00 00 01 fc fc fc fc fc fc fc fc fc fc
Something similar (a refcount underflow) happens with CALIPSO/IPv6. Fix
this by duplicating IP / IPv6 options after clone, so that
ip{,6}_sock_destruct() doesn't end up freeing the same memory area twice.
Fixes: cf7da0d66cc1 ("mptcp: Create SUBFLOW socket for incoming connections")
Cc: stable@vger.kernel.org
Signed-off-by: Davide Caratti <dcaratti@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-8-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commands 'ss -M' are used in script mptcp_join.sh to display only MPTCP
sockets. So it must be checked if ss tool supports MPTCP in this script.
Fixes: e274f7154008 ("selftests: mptcp: add subflow limits test-cases")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-7-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After the blamed commit below, the send buffer auto-tuning can
happen after that the mptcp_propagate_sndbuf() completes - via
the delegated action infrastructure.
We must check for write space even after such change or we risk
missing the wake-up event.
Fixes: 8005184fd1ca ("mptcp: refactor sndbuf auto-tuning")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-6-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Such value should be inherited from the first subflow, but
passive sockets always used 'rsk_rcv_wnd'.
Fixes: 6f8a612a33e4 ("mptcp: keep track of advertised windows right edge")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-5-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
when inserting not contiguous data in the subflow write queue,
the protocol creates a new skb and prevent the TCP stack from
merging it later with already queued skbs by setting the EOR marker.
Still no push flag is explicitly set at the end of previous GSO
packet, making the aggregation on the receiver side sub-optimal -
and packetdrill self-tests less predictable.
Explicitly mark the end of not contiguous DSS with the push flag.
Fixes: 6d0060f600ad ("mptcp: Write MPTCP DSS headers to outgoing data packets")
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-4-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
After the 'Fixes' commit mentioned below, the client side might print
the following warning once when a subflow is fully established at the
reception of any valid additional ack:
MPTCP: bogus mpc option on established client sk
That's a normal situation, and no warning should be printed for that. We
can then skip the check when the label is used.
Fixes: e4a0fa47e816 ("mptcp: corner case locking for rx path fields initialization")
Cc: stable@vger.kernel.org
Suggested-by: Paolo Abeni <pabeni@redhat.com>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-3-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Now both a v4 address and a v4-mapped address are supported when
destroying a userspace pm subflow, this patch adds a second subflow
to "userspace pm add & remove address" test, and two subflows could
be removed two different ways, one with the v4mapped and one with v4.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/387
Fixes: 48d73f609dcc ("selftests: mptcp: update userspace pm addr tests")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-2-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Address family of server side mismatches with that of client side, like
in "userspace pm add & remove address" test:
userspace_pm_add_addr $ns1 10.0.2.1 10
userspace_pm_rm_sf $ns1 "::ffff:10.0.2.1" $SUB_ESTABLISHED
That's because on the server side, the family is set to AF_INET6 and the
v4 address is mapped in a v6 one.
This patch fixes this issue. In mptcp_pm_nl_subflow_destroy_doit(), before
checking local address family with remote address family, map an IPv4
address to an IPv6 address if the pair is a v4-mapped address.
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/387
Fixes: 702c2f646d42 ("mptcp: netlink: allow userspace-driven subflow establishment")
Cc: stable@vger.kernel.org
Signed-off-by: Geliang Tang <tanggeliang@kylinos.cn>
Reviewed-by: Mat Martineau <martineau@kernel.org>
Reviewed-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Link: https://lore.kernel.org/r/20240223-upstream-net-20240223-misc-fixes-v1-1-162e87e48497@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This fixes a possible UAF in if_nlmsg_size(),
which can run without RTNL.
Add rcu protection to "struct dpll_pin"
Move netdev_dpll_pin() from netdevice.h to dpll.h to
decrease name pollution.
Note: This looks possible to no longer acquire RTNL in
netdev_dpll_pin_assign() later in net-next.
v2: do not force rcu_read_lock() in rtnl_dpll_pin_size() (Jiri Pirko)
Fixes: 5f1842692880 ("netdev: expose DPLL pin handle for netdevice")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Arkadiusz Kubalewski <arkadiusz.kubalewski@intel.com>
Cc: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240223123208.3543319-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Same as LAN7800, LAN7850 can be used without EEPROM. If EEPROM is not
present or not flashed, LAN7850 will fail to sync the speed detected by the PHY
with the MAC. In case link speed is 100Mbit, it will accidentally work,
otherwise no data can be transferred.
Better way would be to implement link_up callback, or set auto speed
configuration unconditionally. But this changes would be more intrusive.
So, for now, set it only if no EEPROM is found.
Fixes: e69647a19c87 ("lan78xx: Set ASD in MAC_CR when EEE is enabled.")
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240222123839.2816561-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It seems that if userspace provides a correct IFA_TARGET_NETNSID value
but no IFA_ADDRESS and IFA_LOCAL attributes, inet6_rtm_getaddr()
returns -EINVAL with an elevated "struct net" refcount.
Fixes: 6ecf4c37eb3e ("ipv6: enable IFA_TARGET_NETNSID for RTM_GETADDR")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: David Ahern <dsahern@kernel.org>
Reviewed-by: David Ahern <dsahern@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Test that we keep GRO flag in sync when XDP is disabled while
the device is closed.
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
veth sets NETIF_F_GRO automatically when XDP is enabled,
because both features use the same NAPI machinery.
The logic to clear NETIF_F_GRO sits in veth_disable_xdp() which
is called both on ndo_stop and when XDP is turned off.
To avoid the flag from being cleared when the device is brought
down, the clearing is skipped when IFF_UP is not set.
Bringing the device down should indeed not modify its features.
Unfortunately, this means that clearing is also skipped when
XDP is disabled _while_ the device is down. And there's nothing
on the open path to bring the device features back into sync.
IOW if user enables XDP, disables it and then brings the device
up we'll end up with a stray GRO flag set but no NAPI instances.
We don't depend on the GRO flag on the datapath, so the datapath
won't crash. We will crash (or hang), however, next time features
are sync'ed (either by user via ethtool or peer changing its config).
The GRO flag will go away, and veth will try to disable the NAPIs.
But the open path never created them since XDP was off, the GRO flag
was a stray. If NAPI was initialized before we'll hang in napi_disable().
If it never was we'll crash trying to stop uninitialized hrtimer.
Move the GRO flag updates to the XDP enable / disable paths,
instead of mixing them with the ndo_open / ndo_close paths.
Fixes: d3256efd8e8b ("veth: allow enabling NAPI even without XDP")
Reported-by: Thomas Gleixner <tglx@linutronix.de>
Reported-by: syzbot+039399a9b96297ddedca@syzkaller.appspotmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Toke Høiland-Jørgensen <toke@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When a queue(tfile) is detached, we only update tfile's queue_index,
but do not update xdp_rxq_info's queue_index. This patch fixes it.
Fixes: 8bf5c4ee1889 ("tun: setup xdp_rxq_info")
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Link: https://lore.kernel.org/r/1708398727-46308-1-git-send-email-wangyunjian@huawei.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Commit 3ce4f9c3fbb3 ("net/ps3_gelic_net: Add gelic_descr structures") of
6.8-rc1 had a copy-and-paste error where the pointer that holds the
allocated SKB (struct gelic_descr.skb) was set to NULL after the SKB was
allocated. This resulted in a kernel panic when the SKB pointer was
accessed.
This fix moves the initialization of the gelic_descr to before the SKB
is allocated.
Reported-by: sambat goson <sombat3960@gmail.com>
Fixes: 3ce4f9c3fbb3 ("net/ps3_gelic_net: Add gelic_descr structures")
Signed-off-by: Geoff Levand <geoff@infradead.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Since commit 5d93cfcf7360 ("net: dpaa: Convert to phylink"), we support
the "10gbase-r" phy-mode through a driver-based conversion of "xgmii",
but we still don't actually support it when the device tree specifies
"10gbase-r" proper.
This is because boards such as LS1046A-RDB do not define pcs-handle-names
(for whatever reason) in the ethernet@f0000 device tree node, and the
code enters through this code path:
err = of_property_match_string(mac_node, "pcs-handle-names", "xfi");
// code takes neither branch and falls through
if (err >= 0) {
(...)
} else if (err != -EINVAL && err != -ENODATA) {
goto _return_fm_mac_free;
}
(...)
/* For compatibility, if pcs-handle-names is missing, we assume this
* phy is the first one in pcsphy-handle
*/
err = of_property_match_string(mac_node, "pcs-handle-names", "sgmii");
if (err == -EINVAL || err == -ENODATA)
pcs = memac_pcs_create(mac_node, 0); // code takes this branch
else if (err < 0)
goto _return_fm_mac_free;
else
pcs = memac_pcs_create(mac_node, err);
// A default PCS is created and saved in "pcs"
// This determination fails and mistakenly saves the default PCS
// memac->sgmii_pcs instead of memac->xfi_pcs, because at this
// stage, mac_dev->phy_if == PHY_INTERFACE_MODE_10GBASER.
if (err && mac_dev->phy_if == PHY_INTERFACE_MODE_XGMII)
memac->xfi_pcs = pcs;
else
memac->sgmii_pcs = pcs;
In other words, in the absence of pcs-handle-names, the default
xfi_pcs assignment logic only works when in the device tree we have
PHY_INTERFACE_MODE_XGMII.
By reversing the order between the fallback xfi_pcs assignment and the
"xgmii" overwrite with "10gbase-r", we are able to support both values
in the device tree, with identical behavior.
Currently, it is impossible to make the s/xgmii/10gbase-r/ device tree
conversion, because it would break forward compatibility (new device
tree with old kernel). The only way to modify existing device trees to
phy-interface-mode = "10gbase-r" is to fix stable kernels to accept this
value and handle it properly.
One reason why the conversion is desirable is because with pre-phylink
kernels, the Aquantia PHY driver used to warn about the improper use
of PHY_INTERFACE_MODE_XGMII [1]. It is best to have a single (latest)
device tree that works with all supported stable kernel versions.
Note that the blamed commit does not constitute a regression per se.
Older stable kernels like 6.1 still do not work with "10gbase-r", but
for a different reason. That is a battle for another time.
[1] https://lore.kernel.org/netdev/20240214-ls1046-dts-use-10gbase-r-v1-1-8c2d68547393@concurrent-rt.com/
Fixes: 5d93cfcf7360 ("net: dpaa: Convert to phylink")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Currently, mctp_local_output only takes ownership of skb on success, and
we may leak an skb if mctp_local_output fails in specific states; the
skb ownership isn't transferred until the actual output routing occurs.
Instead, make mctp_local_output free the skb on all error paths up to
the route action, so it always consumes the passed skb.
Fixes: 833ef3b91de6 ("mctp: Populate socket implementation")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240220081053.1439104-1-jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tony Nguyen says:
====================
Intel Wired LAN Driver Updates 2024-02-20 (ice)
This series contains updates to ice driver only.
Yochai sets parent device to properly reflect connection state between
source DPLL and output pin.
Arkadiusz fixes additional issues related to DPLL; proper reporting of
phase_adjust value and preventing use/access of data while resetting.
Amritha resolves ASSERT_RTNL() being triggered on certain reset/rebuild
flows.
* '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/net-queue:
ice: Fix ASSERT_RTNL() warning during certain scenarios
ice: fix pin phase adjust updates on PF reset
ice: fix dpll periodic work data updates on PF reset
ice: fix dpll and dpll_pin data access on PF reset
ice: fix dpll input pin phase_adjust value updates
ice: fix connection state of DPLL and out pin
====================
Reviewed-by: Vadim Fedorenko <vadim.fedorenko@linux.dev>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Link: https://lore.kernel.org/r/20240220214444.1039759-1-anthony.l.nguyen@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
syzkaller triggered following kasan splat:
BUG: KASAN: use-after-free in __skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
Read of size 1 at addr ffff88812fb4000e by task syz-executor183/5191
[..]
kasan_report+0xda/0x110 mm/kasan/report.c:588
__skb_flow_dissect+0x19d1/0x7a50 net/core/flow_dissector.c:1170
skb_flow_dissect_flow_keys include/linux/skbuff.h:1514 [inline]
___skb_get_hash net/core/flow_dissector.c:1791 [inline]
__skb_get_hash+0xc7/0x540 net/core/flow_dissector.c:1856
skb_get_hash include/linux/skbuff.h:1556 [inline]
ip_tunnel_xmit+0x1855/0x33c0 net/ipv4/ip_tunnel.c:748
ipip_tunnel_xmit+0x3cc/0x4e0 net/ipv4/ipip.c:308
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
__dev_queue_xmit+0x7c1/0x3d60 net/core/dev.c:4349
dev_queue_xmit include/linux/netdevice.h:3134 [inline]
neigh_connected_output+0x42c/0x5d0 net/core/neighbour.c:1592
...
ip_finish_output2+0x833/0x2550 net/ipv4/ip_output.c:235
ip_finish_output+0x31/0x310 net/ipv4/ip_output.c:323
..
iptunnel_xmit+0x5b4/0x9b0 net/ipv4/ip_tunnel_core.c:82
ip_tunnel_xmit+0x1dbc/0x33c0 net/ipv4/ip_tunnel.c:831
ipgre_xmit+0x4a1/0x980 net/ipv4/ip_gre.c:665
__netdev_start_xmit include/linux/netdevice.h:4940 [inline]
netdev_start_xmit include/linux/netdevice.h:4954 [inline]
xmit_one net/core/dev.c:3548 [inline]
dev_hard_start_xmit+0x13d/0x6d0 net/core/dev.c:3564
...
The splat occurs because skb->data points past skb->head allocated area.
This is because neigh layer does:
__skb_pull(skb, skb_network_offset(skb));
... but skb_network_offset() returns a negative offset and __skb_pull()
arg is unsigned. IOW, we skb->data gets "adjusted" by a huge value.
The negative value is returned because skb->head and skb->data distance is
more than 64k and skb->network_header (u16) has wrapped around.
The bug is in the ip_tunnel infrastructure, which can cause
dev->needed_headroom to increment ad infinitum.
The syzkaller reproducer consists of packets getting routed via a gre
tunnel, and route of gre encapsulated packets pointing at another (ipip)
tunnel. The ipip encapsulation finds gre0 as next output device.
This results in the following pattern:
1). First packet is to be sent out via gre0.
Route lookup found an output device, ipip0.
2).
ip_tunnel_xmit for gre0 bumps gre0->needed_headroom based on the future
output device, rt.dev->needed_headroom (ipip0).
3).
ip output / start_xmit moves skb on to ipip0. which runs the same
code path again (xmit recursion).
4).
Routing step for the post-gre0-encap packet finds gre0 as output device
to use for ipip0 encapsulated packet.
tunl0->needed_headroom is then incremented based on the (already bumped)
gre0 device headroom.
This repeats for every future packet:
gre0->needed_headroom gets inflated because previous packets' ipip0 step
incremented rt->dev (gre0) headroom, and ipip0 incremented because gre0
needed_headroom was increased.
For each subsequent packet, gre/ipip0->needed_headroom grows until
post-expand-head reallocations result in a skb->head/data distance of
more than 64k.
Once that happens, skb->network_header (u16) wraps around when
pskb_expand_head tries to make sure that skb_network_offset() is unchanged
after the headroom expansion/reallocation.
After this skb_network_offset(skb) returns a different (and negative)
result post headroom expansion.
The next trip to neigh layer (or anything else that would __skb_pull the
network header) makes skb->data point to a memory location outside
skb->head area.
v2: Cap the needed_headroom update to an arbitarily chosen upperlimit to
prevent perpetual increase instead of dropping the headroom increment
completely.
Reported-and-tested-by: syzbot+bfde3bef047a81b8fde6@syzkaller.appspotmail.com
Closes: https://groups.google.com/g/syzkaller-bugs/c/fL9G6GtWskY/m/VKk_PR5FBAAJ
Fixes: 243aad830e8a ("ip_gre: include route header_len in max_headroom calculation")
Signed-off-by: Florian Westphal <fw@strlen.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240220135606.4939-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch adds support for the SYS TEC USB-SPEmodule1 10Base-T1L
ethernet device to the existing smsc95xx driver by adding the new
USB VID/PID pair.
Signed-off-by: Andre Werner <andre.werner@systec-electronic.com>
Link: https://lore.kernel.org/r/20240219053413.4732-1-andre.werner@systec-electronic.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
BUG: KMSAN: uninit-value in nla_validate_range_unsigned lib/nlattr.c:222 [inline]
BUG: KMSAN: uninit-value in nla_validate_int_range lib/nlattr.c:336 [inline]
BUG: KMSAN: uninit-value in validate_nla lib/nlattr.c:575 [inline]
BUG: KMSAN: uninit-value in __nla_validate_parse+0x2e20/0x45c0 lib/nlattr.c:631
nla_validate_range_unsigned lib/nlattr.c:222 [inline]
nla_validate_int_range lib/nlattr.c:336 [inline]
validate_nla lib/nlattr.c:575 [inline]
...
The message in question matches this policy:
[NFTA_TARGET_REV] = NLA_POLICY_MAX(NLA_BE32, 255),
but because NLA_BE32 size in minlen array is 0, the validation
code will read past the malformed (too small) attribute.
Note: Other attributes, e.g. BITFIELD32, SINT, UINT.. are also missing:
those likely should be added too.
Reported-by: syzbot+3f497b07aa3baf2fb4d0@syzkaller.appspotmail.com
Reported-by: xingwei lee <xrivendell7@gmail.com>
Closes: https://lore.kernel.org/all/CABOYnLzFYHSnvTyS6zGa-udNX55+izqkOt2sB9WDqUcEGW6n8w@mail.gmail.com/raw
Fixes: ecaf75ffd5f5 ("netlink: introduce bigendian integer types")
Signed-off-by: Florian Westphal <fw@strlen.de>
Link: https://lore.kernel.org/r/20240221172740.5092-1-fw@strlen.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current release - regressions:
- af_unix: fix another unix GC hangup
Previous releases - regressions:
- core: fix a possible AF_UNIX deadlock
- bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready()
- netfilter: nft_flow_offload: release dst in case direct xmit path is used
- bridge: switchdev: ensure MDB events are delivered exactly once
- l2tp: pass correct message length to ip6_append_data
- dccp/tcp: unhash sk from ehash for tb2 alloc failure after check_estalblished()
- tls: fixes for record type handling with PEEK
- devlink: fix possible use-after-free and memory leaks in devlink_init()
Previous releases - always broken:
- bpf: fix an oops when attempting to read the vsyscall
page through bpf_probe_read_kernel
- sched: act_mirred: use the backlog for mirred ingress
- netfilter: nft_flow_offload: fix dst refcount underflow
- ipv6: sr: fix possible use-after-free and null-ptr-deref
- mptcp: fix several data races
- phonet: take correct lock to peek at the RX queue
Misc:
- handful of fixes and reliability improvements for selftests
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmXXKMMSHHBhYmVuaUBy
ZWRoYXQuY29tAAoJECkkeY3MjxOkmgAQAIV2NAVEvHVBtnm0Df9PuCcHQx6i9veS
tGxOZMVwb5ePFI+dpiNyyn61koEiRuFLOm66pfJAuT5j5z6m4PEFfPZgtiVpCHVK
4sz4UD4+jVLmYijv+YlWkPU3RWR0RejSkDbXwY5Y9Io/DWHhA2iq5IyMy2MncUPY
dUc12ddEsYRH60Kmm2/96FcdbHw9Y64mDC8tIeIlCAQfng4U98EXJbCq9WXsPPlW
vjwSKwRG76QGDugss9XkatQ7Bsva1qTobFGDOvBMQpMt+dr81pTGVi0c1h/drzvI
EJaDO8jJU3Xy0pQ80beboCJ1KlVCYhWSmwlBMZUA1f0lA2m3U5UFEtHA5hHKs3Mi
jNe/sgKXzThrro0fishAXbzrro2QDhCG3Vm4PRlOGexIyy+n0gIp1lHwEY1p2vX9
RJPdt1e3xt/5NYRv6l2GVQYFi8Wd0endgzCdJeXk0OWQFLFtnxhG6ejpgxtgN0fp
CzKU6orFpsddQtcEOdIzKMUA3CXYWAdQPXOE5Ptjoz3MXZsQqtMm3vN4and8jJ19
8/VLsCNPp11bSRTmNY3Xt85e+gjIA2mRwgRo+ieL6b1x2AqNeVizlr6IZWYQ4TdG
rUdlEX0IVmov80TSeQoWgtzTO7xMER+qN6FxAs3pQoUFjtol3pEURq9FQ2QZ8jW4
5rKpNBrjKxdk
=eUOc
-----END PGP SIGNATURE-----
Merge tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Pull networking fixes from Paolo Abeni:
"Including fixes from bpf and netfilter.
Current release - regressions:
- af_unix: fix another unix GC hangup
Previous releases - regressions:
- core: fix a possible AF_UNIX deadlock
- bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready()
- netfilter: nft_flow_offload: release dst in case direct xmit path
is used
- bridge: switchdev: ensure MDB events are delivered exactly once
- l2tp: pass correct message length to ip6_append_data
- dccp/tcp: unhash sk from ehash for tb2 alloc failure after
check_estalblished()
- tls: fixes for record type handling with PEEK
- devlink: fix possible use-after-free and memory leaks in
devlink_init()
Previous releases - always broken:
- bpf: fix an oops when attempting to read the vsyscall page through
bpf_probe_read_kernel
- sched: act_mirred: use the backlog for mirred ingress
- netfilter: nft_flow_offload: fix dst refcount underflow
- ipv6: sr: fix possible use-after-free and null-ptr-deref
- mptcp: fix several data races
- phonet: take correct lock to peek at the RX queue
Misc:
- handful of fixes and reliability improvements for selftests"
* tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
l2tp: pass correct message length to ip6_append_data
net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHY
selftests: ioam: refactoring to align with the fix
Fix write to cloned skb in ipv6_hop_ioam()
phonet/pep: fix racy skb_queue_empty() use
phonet: take correct lock to peek at the RX queue
net: sparx5: Add spinlock for frame transmission from CPU
net/sched: flower: Add lock protection when remove filter handle
devlink: fix port dump cmd type
net: stmmac: Fix EST offset for dwmac 5.10
tools: ynl: don't leak mcast_groups on init error
tools: ynl: make sure we always pass yarg to mnl_cb_run
net: mctp: put sock on tag allocation failure
netfilter: nf_tables: use kzalloc for hook allocation
netfilter: nf_tables: register hooks last when adding new chain/flowtable
netfilter: nft_flow_offload: release dst in case direct xmit path is used
netfilter: nft_flow_offload: reset dst in route object after setting up flow
netfilter: nf_tables: set dormant flag on hook register failure
selftests: tls: add test for peeking past a record of a different type
selftests: tls: add test for merging of same-type control messages
...
- While working on the ring buffer I noticed that the counter used
for knowing where the end of the data is on a sub-buffer was not
a full "int" but just 20 bits. It was masked out to 0xfffff.
With the new code that allows the user to change the size of the
sub-buffer, it is theoretically possible to ask for a size
bigger than 2^20. If that happens, unexpected results may
occur as there's no code checking if the counter overflowed the
20 bits of the write mask. There are other checks to make sure
events fit in the sub-buffer, but if the sub-buffer itself is
too big, that is not checked.
Add a check in the resize of the sub-buffer to make sure that it
never goes beyond the size of the counter that holds how much
data is on it.
-----BEGIN PGP SIGNATURE-----
iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZdaf+RQccm9zdGVkdEBn
b29kbWlzLm9yZwAKCRAp5XQQmuv6qjEIAQDpsvHqUFNoG5fkRlWr2U0hNl5M6zLI
xTf2mWoG/h8bwQD+NfiRC2UrD5EaubO15z0z6MxScOl1H9X+iI7WVwZkqQ8=
=txOr
-----END PGP SIGNATURE-----
Merge tag 'trace-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace
Pull tracing fix from Steven Rostedt:
- While working on the ring buffer I noticed that the counter used for
knowing where the end of the data is on a sub-buffer was not a full
"int" but just 20 bits. It was masked out to 0xfffff.
With the new code that allows the user to change the size of the
sub-buffer, it is theoretically possible to ask for a size bigger
than 2^20. If that happens, unexpected results may occur as there's
no code checking if the counter overflowed the 20 bits of the write
mask. There are other checks to make sure events fit in the
sub-buffer, but if the sub-buffer itself is too big, that is not
checked.
Add a check in the resize of the sub-buffer to make sure that it
never goes beyond the size of the counter that holds how much data is
on it.
* tag 'trace-v6.8-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace:
ring-buffer: Do not let subbuf be bigger than write mask
l2tp_ip6_sendmsg needs to avoid accounting for the transport header
twice when splicing more data into an already partially-occupied skbuff.
To manage this, we check whether the skbuff contains data using
skb_queue_empty when deciding how much data to append using
ip6_append_data.
However, the code which performed the calculation was incorrect:
ulen = len + skb_queue_empty(&sk->sk_write_queue) ? transhdrlen : 0;
...due to C operator precedence, this ends up setting ulen to
transhdrlen for messages with a non-zero length, which results in
corrupted packets on the wire.
Add parentheses to correct the calculation in line with the original
intent.
Fixes: 9d4c75800f61 ("ipv4, ipv6: Fix handling of transhdrlen in __ip{,6}_append_data()")
Cc: David Howells <dhowells@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by: Tom Parkin <tparkin@katalix.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240220122156.43131-1-tparkin@katalix.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmXWjUEACgkQ1V2XiooU
IOQEqA//c15K4sL5v3ROUgDKYo7d5W7mnb3c9T2b2I4tZIyAj+f1+6DhGz2PB/5L
BjdAXbz2FrrkSt/x4fcu0CkcXC2d5tVcVhvt+CTpaph70xOXBL0XtN+x3NfXlZVg
r9Q/6tV3pBE6u6LdqogQsQtehhqYMgzPVfKuUVYbvM4efqV/vvKiBbHl5DedtKk0
GKiwGEKnXbBUxpJueSUAX/+C64Ldlhw4MVswkJfjA8r56FJEsxPet9tlAphqd+6P
qg1bECQf3NQw8DVBtMc9U1Izb8HhiGEskG72e450Uo1X7SL6EACDFCMVTAeEz+fu
sDNPpS/V7PEP3tImJm0Rj6N6iYGL19tWfCVMdesP+KF5yokNbQF54Xgz4ETkVyrt
EZkR5JL6pRLBQ7FJ/2TD0IIFEn09KayMXLI0Dlugl90lOsn1T6Dnmmh8563nZ2eT
6zio/4NqYRzCXSieCs3zHRZCH0l2tttkkKi0MhVJwGBJd8Wl0qeXHK/UxlE5Pfkn
qhD2ryuCHbIad2JxS8mb1pIzMhw8sy3LsxQQ91CShQ2ujTLY35dhcWJPDm8u77md
VE4lTUmj8uvExLjG/xf+5bzvTutYUGdacRmYwzyFTl/ix3tYvg75GoI4l2iGd4xl
6COxPtLaFDkxo5GxODLrmjxe11E8nT2BKI7rPsOjee8ueAzJZdM=
=iPIR
-----END PGP SIGNATURE-----
Merge tag 'nf-24-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf
Pablo Neira Ayuso says:
====================
Netfilter fixes for net
The following patchset contains Netfilter fixes for net:
1) If user requests to wake up a table and hook fails, restore the
dormant flag from the error path, from Florian Westphal.
2) Reset dst after transferring it to the flow object, otherwise dst
gets released twice from the error path.
3) Release dst in case the flowtable selects a direct xmit path, eg.
transmission to bridge port. Otherwise, dst is memleaked.
4) Register basechain and flowtable hooks at the end of the command.
Error path releases these datastructure without waiting for the
rcu grace period.
5) Use kzalloc() to initialize struct nft_hook to fix a KMSAN report
on access to hook type, also from Florian Westphal.
netfilter pull request 24-02-22
* tag 'nf-24-02-22' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
netfilter: nf_tables: use kzalloc for hook allocation
netfilter: nf_tables: register hooks last when adding new chain/flowtable
netfilter: nft_flow_offload: release dst in case direct xmit path is used
netfilter: nft_flow_offload: reset dst in route object after setting up flow
netfilter: nf_tables: set dormant flag on hook register failure
====================
Link: https://lore.kernel.org/r/20240222000843.146665-1-pablo@netfilter.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZdaBCwAKCRDbK58LschI
g3EhAP0d+S18mNabiEGz8efnE2yz3XcFchJgjiRS8WjOv75GvQEA6/sWncFjbc8k
EqxPHmeJa19rWhQlFrmlyNQfLYGe4gY=
=VkOs
-----END PGP SIGNATURE-----
Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf
Daniel Borkmann says:
====================
pull-request: bpf 2024-02-22
The following pull-request contains BPF updates for your *net* tree.
We've added 11 non-merge commits during the last 24 day(s) which contain
a total of 15 files changed, 217 insertions(+), 17 deletions(-).
The main changes are:
1) Fix a syzkaller-triggered oops when attempting to read the vsyscall
page through bpf_probe_read_kernel and friends, from Hou Tao.
2) Fix a kernel panic due to uninitialized iter position pointer in
bpf_iter_task, from Yafang Shao.
3) Fix a race between bpf_timer_cancel_and_free and bpf_timer_cancel,
from Martin KaFai Lau.
4) Fix a xsk warning in skb_add_rx_frag() (under CONFIG_DEBUG_NET)
due to incorrect truesize accounting, from Sebastian Andrzej Siewior.
5) Fix a NULL pointer dereference in sk_psock_verdict_data_ready,
from Shigeru Yoshida.
6) Fix a resolve_btfids warning when bpf_cpumask symbol cannot be
resolved, from Hari Bathini.
bpf-for-netdev
* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
bpf, sockmap: Fix NULL pointer dereference in sk_psock_verdict_data_ready()
selftests/bpf: Add negtive test cases for task iter
bpf: Fix an issue due to uninitialized bpf_iter_task
selftests/bpf: Test racing between bpf_timer_cancel_and_free and bpf_timer_cancel
bpf: Fix racing between bpf_timer_cancel_and_free and bpf_timer_cancel
selftest/bpf: Test the read of vsyscall page under x86-64
x86/mm: Disallow vsyscall page read for copy_from_kernel_nofault()
x86/mm: Move is_vsyscall_vaddr() into asm/vsyscall.h
bpf, scripts: Correct GPL license name
xsk: Add truesize to skb_add_rx_frag().
bpf: Fix warning for bpf_cpumask in verifier
====================
Link: https://lore.kernel.org/r/20240221231826.1404-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Commit bb726b753f75 ("net: phy: realtek: add support for
RTL8211F(D)(I)-VD-CG") extended support of the driver from the existing
support for RTL8211F(D)(I)-CG PHY to the newer RTL8211F(D)(I)-VD-CG PHY.
While that commit indicated that the RTL8211F_PHYCR2 register is not
supported by the "VD-CG" PHY model and therefore updated the corresponding
section in rtl8211f_config_init() to be invoked conditionally, the call to
"genphy_soft_reset()" was left as-is, when it should have also been invoked
conditionally. This is because the call to "genphy_soft_reset()" was first
introduced by the commit 0a4355c2b7f8 ("net: phy: realtek: add dt property
to disable CLKOUT clock") since the RTL8211F guide indicates that a PHY
reset should be issued after setting bits in the PHYCR2 register.
As the PHYCR2 register is not applicable to the "VD-CG" PHY model, fix the
rtl8211f_config_init() function by invoking "genphy_soft_reset()"
conditionally based on the presence of the "PHYCR2" register.
Fixes: bb726b753f75 ("net: phy: realtek: add support for RTL8211F(D)(I)-VD-CG")
Signed-off-by: Siddharth Vadapalli <s-vadapalli@ti.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240220070007.968762-1-s-vadapalli@ti.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Justin Iurman says:
====================
ioam6: fix write to cloned skb's
Make sure the IOAM data insertion is not applied on cloned skb's. As a
consequence, ioam selftests needed a refactoring.
====================
Link: https://lore.kernel.org/r/20240219135255.15429-1-justin.iurman@uliege.be
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
ioam6_parser uses a packet socket. After the fix to prevent writing to
cloned skb's, the receiver does not see its IOAM data anymore, which
makes input/forward ioam-selftests to fail. As a workaround,
ioam6_parser now uses an IPv6 raw socket and leverages ancillary data to
get hop-by-hop options. As a consequence, the hook is "after" the IOAM
data insertion by the receiver and all tests are working again.
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
ioam6_fill_trace_data() writes inside the skb payload without ensuring
it's writeable (e.g., not cloned). This function is called both from the
input and output path. The output path (ioam6_iptunnel) already does the
check. This commit provides a fix for the input path, inside
ipv6_hop_ioam(). It also updates ip6_parse_tlv() to refresh the network
header pointer ("nh") when returning from ipv6_hop_ioam().
Fixes: 9ee11f0fff20 ("ipv6: ioam: Data plane support for Pre-allocated Trace")
Reported-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Justin Iurman <justin.iurman@uliege.be>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The receive queues are protected by their respective spin-lock, not
the socket lock. This could lead to skb_peek() unexpectedly
returning NULL or a pointer to an already dequeued socket buffer.
Fixes: 9641458d3ec4 ("Phonet: Pipe End Point for Phonet Pipes protocol")
Signed-off-by: Rémi Denis-Courmont <courmisch@gmail.com>
Link: https://lore.kernel.org/r/20240218081214.4806-2-remi@remlab.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The receive queue is protected by its embedded spin-lock, not the
socket lock, so we need the former lock here (and only that one).
Fixes: 107d0d9b8d9a ("Phonet: Phonet datagram transport protocol")
Reported-by: Luosili <rootlab@huawei.com>
Signed-off-by: Rémi Denis-Courmont <courmisch@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://lore.kernel.org/r/20240218081214.4806-1-remi@remlab.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Both registers used when doing manual injection or fdma injection are
shared between all the net devices of the switch. It was noticed that
when having two process which each of them trying to inject frames on
different ethernet ports, that the HW started to behave strange, by
sending out more frames then expected. When doing fdma injection it is
required to set the frame in the DCB and then make sure that the next
pointer of the last DCB is invalid. But because there is no locks for
this, then easily this pointer between the DCB can be broken and then it
would create a loop of DCBs. And that means that the HW will
continuously transmit these frames in a loop. Until the SW will break
this loop.
Therefore to fix this issue, add a spin lock for when accessing the
registers for manual or fdma injection.
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Daniel Machon <daniel.machon@microchip.com>
Fixes: f3cad2611a77 ("net: sparx5: add hostmode with phylink support")
Link: https://lore.kernel.org/r/20240219080043.1561014-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
As IDR can't protect itself from the concurrent modification, place
idr_remove() under the protection of tp->lock.
Fixes: 08a0063df3ae ("net/sched: flower: Move filter handle initialization earlier")
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20240220085928.9161-1-jianbol@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Unlike other commands, due to a c&p error, port dump fills-up cmd with
wrong value, different from port-get request cmd, port-get doit reply
and port notification.
Fix it by filling cmd with value DEVLINK_CMD_PORT_NEW.
Skimmed through devlink userspace implementations, none of them cares
about this cmd value. Only ynl, for which, this is actually a fix, as it
expects doit and dumpit ops rsp_value to be the same.
Omit the fixes tag, even thought this is fix, better to target this for
next release.
Fixes: bfcd3a466172 ("Introduce devlink infrastructure")
Signed-off-by: Jiri Pirko <jiri@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Link: https://lore.kernel.org/r/20240220075245.75416-1-jiri@resnulli.us
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Fix EST offset for dwmac 5.10.
Currently configuring Qbv doesn't work as expected. The schedule is
configured, but never confirmed:
|[ 128.250219] imx-dwmac 428a0000.ethernet eth1: configured EST
The reason seems to be the refactoring of the EST code which set the wrong
EST offset for the dwmac 5.10. After fixing this it works as before:
|[ 106.359577] imx-dwmac 428a0000.ethernet eth1: configured EST
|[ 128.430715] imx-dwmac 428a0000.ethernet eth1: EST: SWOL has been switched
Tested on imx93.
Fixes: c3f3b97238f6 ("net: stmmac: Refactor EST implementation")
Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
Reviewed-by: Serge Semin <fancer.lancer@gmail.com>
Link: https://lore.kernel.org/r/20240220-stmmac_est-v1-1-c41f9ae2e7b7@linutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Make sure to free the already-parsed mcast_groups if
we don't get an ack from the kernel when reading family info.
This is part of the ynl_sock_create() error path, so we won't
get a call to ynl_sock_destroy() to free them later.
Fixes: 86878f14d71a ("tools: ynl: user space helpers")
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20240220161112.2735195-3-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
There is one common error handler in ynl - ynl_cb_error().
It expects priv to be a pointer to struct ynl_parse_arg AKA yarg.
To avoid potential crashes if we encounter a stray NLMSG_ERROR
always pass yarg as priv (or a struct which has it as the first
member).
ynl_cb_null() has a similar problem directly - it expects yarg
but priv passed by the caller is ys.
Found by code inspection.
Fixes: 86878f14d71a ("tools: ynl: user space helpers")
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Link: https://lore.kernel.org/r/20240220161112.2735195-2-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
We may hold an extra reference on a socket if a tag allocation fails: we
optimistically allocate the sk_key, and take a ref there, but do not
drop if we end up not using the allocated key.
Ensure we're dropping the sock on this failure by doing a proper unref
rather than directly kfree()ing.
Fixes: de8a6b15d965 ("net: mctp: add an explicit reference from a mctp_sk_key to sock")
Signed-off-by: Jeremy Kerr <jk@codeconstruct.com.au>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/ce9b61e44d1cdae7797be0c5e3141baf582d23a0.1707983487.git.jk@codeconstruct.com.au
Signed-off-by: Jakub Kicinski <kuba@kernel.org>