Commit Graph

1090553 Commits

Author SHA1 Message Date
Eric Dumazet
783d108dd7 tcp: drop skb dst in tcp_rcv_established()
In commit f84af32cbc ("net: ip_queue_rcv_skb() helper")
I dropped the skb dst in tcp_data_queue().

This only dealt with so-called TCP input slow path.

When fast path is taken, tcp_rcv_established() calls
tcp_queue_rcv() while skb still has a dst.

This was mostly fine, because most dsts at this point
are not refcounted (thanks to early demux)

However, TCP packets sent over loopback have refcounted dst.

Then commit 68822bdf76 ("net: generalize skb freeing
deferral to per-cpu lists") came and had the effect
of delaying skb freeing for an arbitrary time.

If during this time the involved netns is dismantled, cleanup_net()
frees the struct net with embedded net->ipv6.ip6_dst_ops.

Then when eventually dst_destroy_rcu() is called,
if (dst->ops->destroy) ... triggers an use-after-free.

It is not clear if ip6_route_net_exit() lacks a rcu_barrier()
as syzbot reported similar issues before the blamed commit.

( https://groups.google.com/g/syzkaller-bugs/c/CofzW4eeA9A/m/009WjumTAAAJ )

Fixes: 68822bdf76 ("net: generalize skb freeing deferral to per-cpu lists")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Acked-by: Soheil Hassas Yeganeh <soheil@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 13:25:29 +01:00
David S. Miller
90e29e592e Merge branch 'lan966x-phy-reset-remove'
Michael Walle says:

====================
net: lan966x: remove PHY reset support

Remove the unneeded PHY reset node as well as the driver support for it.

This was already discussed [1] and I expect Microchip to Ack on this
removal. Since there is no user, no breakage is expected.

I'm not sure it this should go through net or net-next and if the patches
should have a Fixes: tag or not. In upstream linux there was never any user
of it, so there is no bug to be fixed. But OTOH if the schema fix isn't
backported, then there might be an older schema version still containing
the reset node. Thoughts?

The patches needed for the GPIO part are just waiting to be picked up by
Linus [2,3]. This patch and the GPIO parts are the last pieces of the
puzzle to get ethernet working on the LAN9668 on upstream linux.

[1] https://lore.kernel.org/netdev/20220330110210.3374165-1-michael@walle.cc/
[2] https://lore.kernel.org/linux-gpio/CACRpkdbxmN+SWt95aGHjA2ZGnN61aWaA7c5S4PaG+WePAj=htg@mail.gmail.com/
[3] https://lore.kernel.org/linux-gpio/20220420191926.3411830-1-michael@walle.cc/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 13:09:26 +01:00
Michael Walle
5b06ef8682 net: lan966x: remove PHY reset support
The PHY subsystem as well as the MIIM mdio driver (in case of the
integrated PHYs) will take care of the resets. A separate reset driver
isn't needed. There is no in-tree user of this feature. Remove the
support.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 13:09:26 +01:00
Michael Walle
4fdabd509d dt-bindings: net: lan966x: remove PHY reset
The PHY reset was intended to be a phandle for a special PHY reset
driver for the integrated PHYs as well as any external PHYs. It turns
out, that the culprit is how the reset of the switch device is done.
In particular, the switch reset also affects other subsystems like
the GPIO and the SGPIO block and it happens to be the case that the
reset lines of the external PHYs are connected to a common GPIO line.
Thus as soon as the switch issues a reset during probe time, all the
external PHYs will go into reset because all the GPIO lines will
switch to input and the pull-down on that signal will take effect.

So even if there was a special PHY reset driver, it (1) won't fix
the root cause of the problem and (2) it won't fix all the other
consumers of GPIO lines which will also be reset.

It turns out, the Ocelot SoC has the same weird behavior (or the
lack of a dedicated switch reset) and there the problem is already
solved and all the bits and pieces are already there and this PHY
reset property isn't not needed at all.

There are no users of this binding. Just remove it.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 13:09:25 +01:00
David S. Miller
8fd813441e Merge branch 'ipv6-net-opts'
Pavel Begunkov says:

====================
generic net and ipv6 minor optimisations

1-3 inline simple functions that only reshuffle arguments possibly adding
extra zero args, and call another function. It was benchmarked before with
a bunch of extra patches, see for details

https://lore.kernel.org/netdev/cover.1648981570.git.asml.silence@gmail.com/

It may increase the binary size, but it's the right thing to do and at least
without modules it actually sheds some bytes for some standard-ish config.

   text    data     bss     dec     hex filename
9627200       0       0 9627200  92e640 ./arch/x86_64/boot/bzImage
   text    data     bss     dec     hex filename
9627104       0       0 9627104  92e5e0 ./arch/x86_64/boot/bzImage
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 12:58:45 +01:00
Pavel Begunkov
58f71be58b ipv6: refactor ip6_finish_output2()
Throw neigh checks in ip6_finish_output2() under a single slow path if,
so we don't have the overhead in the hot path.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 12:58:45 +01:00
Pavel Begunkov
4b143ed7dd ipv6: help __ip6_finish_output() inlining
There are two callers of __ip6_finish_output(), both are in
ip6_finish_output(). We can combine the call sites into one and handle
return code after, that will inline __ip6_finish_output().

Note, error handling under NET_XMIT_CN will only return 0 if
__ip6_finish_output() succeded, and in this case it return 0.
Considering that NET_XMIT_SUCCESS is 0, it'll be returning exactly the
same result for it as before.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 12:58:45 +01:00
Pavel Begunkov
c526fd8f9f net: inline dev_queue_xmit()
Inline dev_queue_xmit() and dev_queue_xmit_accel(), they both are small
proxy functions doing nothing but redirecting the control flow to
__dev_queue_xmit().

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 12:58:44 +01:00
Pavel Begunkov
657dd5f97b net: inline skb_zerocopy_iter_dgram
skb_zerocopy_iter_dgram() is a small proxy function, inline it. For
that, move __zerocopy_sg_from_iter into linux/skbuff.h

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 12:58:44 +01:00
Pavel Begunkov
de32bc6aad net: inline sock_alloc_send_skb
sock_alloc_send_skb() is simple and just proxying to another function,
so we can inline it and cut associated overhead.

Signed-off-by: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-30 12:58:44 +01:00
Jakub Kicinski
0813aeee0d Merge branch 'tcp-pass-back-data-left-in-socket-after-receive' of git://git.kernel.org/pub/scm/linux/kernel/git/kuba/linux
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 19:12:05 -07:00
Jens Axboe
f94fd25cb0 tcp: pass back data left in socket after receive
This is currently done for CMSG_INQ, add an ability to do so via struct
msghdr as well and have CMSG_INQ use that too. If the caller sets
msghdr->msg_get_inq, then we'll pass back the hint in msghdr->msg_inq.

Rearrange struct msghdr a bit so we can add this member while shrinking
it at the same time. On a 64-bit build, it was 96 bytes before this
change and 88 bytes afterwards.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Link: https://lore.kernel.org/r/650c22ca-cffc-0255-9a05-2413a1e20826@kernel.dk
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 18:55:27 -07:00
Yinjun Zhang
7195464cf8 nfp: flower: utilize the tuple iifidx in offloading ct flows
The device info from which conntrack originates is stored in metadata
field of the ct flow to offload now, driver can utilize it to reduce
the number of offloaded flows.

v2: Drop inline keyword from get_netdev_from_rule() signature.
    The compiler can decide.

Signed-off-by: Yinjun Zhang <yinjun.zhang@corigine.com>
Signed-off-by: Louis Peens <louis.peens@corigine.com>
Signed-off-by: Simon Horman <simon.horman@corigine.com>
Link: https://lore.kernel.org/r/20220429075124.128589-1-simon.horman@corigine.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 18:44:46 -07:00
Pieter Jansen van Vuuren
78a9b3c47b sfc: add EF100 VF support via a write to sriov_numvfs
This patch extends the EF100 PF driver by adding .sriov_configure()
which would allow users to enable and disable virtual functions
using the sriov sysfs.

Signed-off-by: Pieter Jansen van Vuuren <pieter.jansen-van-vuuren@amd.com>
Signed-off-by: Edward Cree <ecree.xilinx@gmail.com>
Link: https://lore.kernel.org/r/75e74d9e-14ce-0524-9668-5ab735a7cf62@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 18:43:01 -07:00
Jakub Kicinski
4994d4fa99 Merge branch 'mptcp-path-manager-mode-selection'
Mat Martineau says:

====================
mptcp: Path manager mode selection

MPTCP already has an in-kernel path manager (PM) to add and remove TCP
subflows associated with a given MPTCP connection. This in-kernel PM has
been designed to handle typical server-side use cases, but is not very
flexible or configurable for client devices that may have more
complicated policies to implement.

This patch series from the MPTCP tree is the first step toward adding a
generic-netlink-based API for MPTCP path management, which a privileged
userspace daemon will be able to use to control subflow
establishment. These patches add a per-namespace sysctl to select the
default PM type (in-kernel or userspace) for new MPTCP sockets. New
self-tests confirm expected behavior when userspace PM is selected but
there is no daemon available to handle existing MPTCP PM events.

Subsequent patch series (already staged in the MPTCP tree) will add the
generic netlink path management API.
====================

Link: https://lore.kernel.org/r/20220427225002.231996-1-mathew.j.martineau@linux.intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 17:25:25 -07:00
Mat Martineau
5ac1d2d634 selftests: mptcp: Add tests for userspace PM type
These tests ensure that the in-kernel path manager is bypassed when
the userspace path manager is configured. Kernel code is still
responsible for ADD_ADDR echo, so also make sure that's working.

Tested-by: Geliang Tang <geliang.tang@suse.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Co-developed-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Matthieu Baerts <matthieu.baerts@tessares.net>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 17:25:14 -07:00
Mat Martineau
6bb63ccc25 mptcp: Add a per-namespace sysctl to set the default path manager type
The new net.mptcp.pm_type sysctl determines which path manager will be
used by each newly-created MPTCP socket.

v2: Handle builds without CONFIG_SYSCTL
v3: Clarify logic for type-specific PM init (Geliang Tang and Paolo Abeni)

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 17:25:14 -07:00
Mat Martineau
6961326e38 mptcp: Make kernel path manager check for userspace-managed sockets
Userspace-managed sockets should not have their subflows or
advertisements changed by the kernel path manager.

v3: Use helper function for PM mode (Paolo Abeni)

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 17:25:14 -07:00
Mat Martineau
14b06811be mptcp: Bypass kernel PM when userspace PM is enabled
When a MPTCP connection is managed by a userspace PM, bypass the kernel
PM for incoming advertisements and subflow events. Netlink events are
still sent to userspace.

v2: Remove unneeded check in mptcp_pm_rm_addr_received() (Kishen Maloor)
v3: Add and use helper function for PM mode (Paolo Abeni)

Acked-by: Paolo Abeni <pabeni@redhat.com>
Co-developed-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Kishen Maloor <kishen.maloor@intel.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 17:25:14 -07:00
Mat Martineau
d85a8fde71 mptcp: Add a member to mptcp_pm_data to track kernel vs userspace mode
When adding support for netlink path management commands, the kernel
needs to know whether paths are being controlled by the in-kernel path
manager or a userspace PM.

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 17:25:13 -07:00
Mat Martineau
9273b9d579 mptcp: Remove redundant assignments in path manager init
A few members of the mptcp_pm_data struct were assigned to hard-coded
values in mptcp_pm_data_reset(), and then immediately changed in
mptcp_pm_nl_data_init().

Instead, flatten all the assignments in to mptcp_pm_data_reset().

v2: Resolve conflicts due to rename of mptcp_pm_data_reset()
v4: Resolve conflict in mptcp_pm_data_reset()

Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Mat Martineau <mathew.j.martineau@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 17:25:13 -07:00
Jakub Kicinski
a41c653dc5 Merge branch 'net-phy-micrel-add-coma-mode-support'
Michael Walle says:

====================
net: phy: micrel: add coma mode support

Add support to disable coma mode by a GPIO line.
====================

Link: https://lore.kernel.org/r/20220427214406.1348872-1-michael@walle.cc
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 16:37:57 -07:00
Michael Walle
738871b092 net: phy: micrel: add coma mode GPIO
The LAN8814 has a coma mode pin which puts the PHY into isolate and
power-dowm mode. Unfortunately, the mode cannot be disabled by a
register. Usually, the input pin has a pull-up and connected to a GPIO
which can then be used to disable the mode. Try to get the GPIO and
deassert it.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 16:37:53 -07:00
Michael Walle
31d00ca4ce net: phy: micrel: move the PHY timestamping check
Both lan8814_ptp_init() and lan8814_ptp_probe_once() are only used if
PTP and PHY timestamping is enabed. Up until now the probe function just
returns early, if they are not needed. But we need the
phy_package_init_once() functionality for the coma mode GPIO setup. Move
the check into the functions itself.

Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 16:37:53 -07:00
Michael Walle
749c61e5b3 dt-bindings: net: micrel: add coma-mode-gpios property
The LAN8814 has a coma mode pin which is used to put the PHY into
isolate and power-down mode. Usually strapped to be asserted by default.
A GPIO is then used to take the PHY out of this mode.

Signed-off-by: Michael Walle <michael@walle.cc>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-29 16:37:53 -07:00
David S. Miller
17d49e6e80 Merge branch 'remove-NAPI_POLL_WEIGHT-copies'
Merge branch 'remove-NAPI_POLL_WEIGHT-copies'

Jakub Kicinski says:

====================
remove copies of the NAPI_POLL_WEIGHT define

netif_napi_add() takes weight as the last argument. The value of
that parameter is hard to come up with and depends on many factors,
so driver authors are encouraged to use NAPI_POLL_WEIGHT.

We should probably move weight to an "advanced" version of the API
(__netif_napi_add()?) and simplify the life of most driver authors.

In preparation for such API changes this series removes local
defines equivalent to NAPI_POLL_WEIGHT from drivers, so that a simple
coccinelle / spatch script does not get thrown off by them.

v2:
 - drop staging bits (patch 2)
 - fix subject (patch 8)
 - add qeth change (patch 15)
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:57:01 +01:00
Jakub Kicinski
4bb0c7f09a qeth: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Alexandra Winter <wintera@linux.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:42 +01:00
Jakub Kicinski
e9c6ec6510 eth: velocity: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:42 +01:00
Jakub Kicinski
26450aa7ca eth: spider: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Acked-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:42 +01:00
Jakub Kicinski
288696565f eth: vxge: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:42 +01:00
Jakub Kicinski
bbbe6ecbc3 eth: gfar: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:42 +01:00
Jakub Kicinski
e702def527 eth: benet: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
0258f5399f eth: atlantic: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
592df36637 net: bgmac: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Reviewed-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
b3c2b61ef6 slic: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Lino Sanfilippo <LinoSanfilippo@gmx.de>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
f130683b1e usb: lan78xx: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
889e3691b9 eth: mtk_eth_soc: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
feda771f1b eth: pch_gbe: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Reviewed-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
055e13f31f eth: cpsw: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
e2a303295d eth: smsc: remove a copy of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Jakub Kicinski
5f012b40ef eth: remove copies of the NAPI_POLL_WEIGHT define
Defining local versions of NAPI_POLL_WEIGHT with the same
values in the drivers just makes refactoring harder.

Drop the special defines in a bunch of drivers where the
removal is relatively simple so grouping into one patch
does not impact reviewability.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2022-04-29 11:56:41 +01:00
Nathan Rossi
5da66099d6 net: dsa: mv88e6xxx: Single chip mode detection for MV88E6*41
The mv88e6xxx driver expects switches that are configured in single chip
addressing mode to have the MDIO address configured as 0. This is due to
the switch ADDR pins representing the single chip addressing mode as 0.
However depending on the device (e.g. MV88E6*41) the switch does not
respond on address 0 or any other address below 16 (the first port
address) in single chip addressing mode. This allows for other devices
to be on the same shared MDIO bus despite the switch being in single
chip addressing mode.

When using a switch that works this way it is not possible to configure
switch driver as single chip addressing via device tree, along with
another MDIO device on the same bus with address 0, as both devices
would have the same address of 0 resulting in mdiobus_register_device
-EBUSY errors for one of the devices with address 0.

In order to support this configuration the switch node can have its MDIO
address configured as 16 (the first address that the device responds
to). During initialization the driver will treat this address similar to
how address 0 is, however because this address is also a valid
multi-chip address (in certain switch models, but not all) the driver
will configure the SMI in single chip addressing mode and attempt to
detect the switch model. If the device is configured in single chip
addressing mode this will succeed and the initialization process can
continue. If it fails to detect a valid model this is because the switch
model register is not a valid register when in multi-chip mode, it will
then fall back to the existing SMI initialization process using the MDIO
address as the multi-chip mode address.

This detection method is safe if the device is in either mode because
the single chip addressing mode read is a direct SMI/MDIO read operation
and has no side effects compared to the SMI writes required for the
multi-chip addressing mode.

In order to implement this change, the reset gpio configuration is moved
to occur before any SMI initialization. This ensures that the device has
the same/correct reset gpio state for both mv88e6xxx_smi_init calls.

Signed-off-by: Nathan Rossi <nathan@nathanrossi.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20220427130928.540007-1-nathan@nathanrossi.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 18:38:09 -07:00
Volodymyr Mytnyk
dde2daa0a2 net: prestera: add police action support
- Add HW api to configure policer:
  - SR TCM policer mode is only supported for now.
  - Policer ingress/egress direction support.
- Add police action support into flower

Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
Link: https://lore.kernel.org/r/1651061148-21321-1-git-send-email-volodymyr.mytnyk@plvision.eu
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 18:37:55 -07:00
Lukas Wunner
07caad0bb1 net: phy: Deduplicate interrupt disablement on PHY attach
phy_attach_direct() first calls phy_init_hw() (which restores interrupt
settings through ->config_intr()), then calls phy_disable_interrupts().

So if phydev->interrupts was previously set to 1, interrupts are briefly
enabled, then disabled, which seems nonsensical.

If it was previously set to 0, interrupts are disabled twice, which is
equally nonsensical.

Deduplicate interrupt disablement.

Signed-off-by: Lukas Wunner <lukas@wunner.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/805ccdc606bd8898d59931bd4c7c68537ed6e550.1651040826.git.lukas@wunner.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 18:37:48 -07:00
Erin MacNeil
6fd1d51cfa net: SO_RCVMARK socket option for SO_MARK with recvmsg()
Adding a new socket option, SO_RCVMARK, to indicate that SO_MARK
should be included in the ancillary data returned by recvmsg().

Renamed the sock_recv_ts_and_drops() function to sock_recv_cmsgs().

Signed-off-by: Erin MacNeil <lnx.erin@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Acked-by: Marc Kleine-Budde <mkl@pengutronix.de>
Link: https://lore.kernel.org/r/20220427200259.2564-1-lnx.erin@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 13:08:15 -07:00
Jakub Kicinski
0e55546b18 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
include/linux/netdevice.h
net/core/dev.c
  6510ea973d ("net: Use this_cpu_inc() to increment net->core_stats")
  794c24e992 ("net-core: rx_otherhost_dropped to core_stats")
https://lore.kernel.org/all/20220428111903.5f4304e0@canb.auug.org.au/

drivers/net/wan/cosa.c
  d48fea8401 ("net: cosa: fix error check return value of register_chrdev()")
  89fbca3307 ("net: wan: remove support for COSA and SRP synchronous serial boards")
https://lore.kernel.org/all/20220428112130.1f689e5e@canb.auug.org.au/

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2022-04-28 13:02:01 -07:00
Linus Torvalds
249aca0d3d Networking fixes for 5.18-rc5, including fixes from bluetooth, bpf
and netfilter.
 
 Current release - new code bugs:
 
  - bridge: switchdev: check br_vlan_group() return value
 
  - use this_cpu_inc() to increment net->core_stats, fix preempt-rt
 
 Previous releases - regressions:
 
  - eth: stmmac: fix write to sgmii_adapter_base
 
 Previous releases - always broken:
 
  - netfilter: nf_conntrack_tcp: re-init for syn packets only,
    resolving issues with TCP fastopen
 
  - tcp: md5: fix incorrect tcp_header_len for incoming connections
 
  - tcp: fix F-RTO may not work correctly when receiving DSACK
 
  - tcp: ensure use of most recently sent skb when filling rate samples
 
  - tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT
 
  - virtio_net: fix wrong buf address calculation when using xdp
 
  - xsk: fix forwarding when combining copy mode with busy poll
 
  - xsk: fix possible crash when multiple sockets are created
 
  - bpf: lwt: fix crash when using bpf_skb_set_tunnel_key() from
    bpf_xmit lwt hook
 
  - sctp: null-check asoc strreset_chunk in sctp_generate_reconf_event
 
  - wireguard: device: check for metadata_dst with skb_valid_dst()
 
  - netfilter: update ip6_route_me_harder to consider L3 domain
 
  - gre: make o_seqno start from 0 in native mode
 
  - gre: switch o_seqno to atomic to prevent races in collect_md mode
 
 Misc:
 
  - add Eric Dumazet to networking maintainers
 
  - dt: dsa: realtek: remove realtek,rtl8367s string
 
  - netfilter: flowtable: Remove the empty file
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmJq2r4ACgkQMUZtbf5S
 IrthIxAAjGEcLr25lkB0IWcjOD5wqOuhaRKeSWXnbm5bPkWIaxVMsssBAR8DS78S
 bsaJ0yTKQqv4vLtlMjtQpVC/azr0NTmsb0y5+6C5d4IObBf2Mv1LPpkiqs0d+phQ
 khPsCh0QGtSJT9VbaMu5+JW+c6Jo0kVmnmOgmhZMo1QKFw/bocdJrxQoZjcve9/X
 /uTDFEn8dPWbQKOm1TmSvQhuEPh1V6ZAf8/cBikN2Yul1R0EYbZO6RKSfrgqaa7T
 aRMMTEwRoTEuRdjF97F7ZgGXhoRxP9rW+bdzoQ0ewRu+KgKKPjCL2eVgeNfTNptj
 FjaUpNrImkYUxQ8+x7sQVPdwFaScVVtjHFfqFl0CsvhddT6nQw2trElccfy3/16K
 0GEKBXKCB3B9h02fhillPFveZzDChy/5NTezARqMYP0eG5SBUHLCCymZnqnoNkwV
 hdcmciZTnJzIxPJcmlp8F7D5etueDOwh03nMcFxRf3eW0IcC+Hl6qtbOFUzr6KhB
 V/TLh7N+Smy3JtsavU9aj4iSQGR+kChCt5zhH9idkuEsUgdYo3apB4q3k3ShzWqM
 SJx4gxp5HxwobLx+uW/HJMJTmwA8fApMbIl0OOPm+qiHfULufCZb6BS3AZGb7jdX
 wrcEPeZxBTCZqHeQc0kuo9/CtTTvbawJYtP5LRiZ5bWfoKn5F9o=
 =XOSt
 -----END PGP SIGNATURE-----

Merge tag 'net-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Including fixes from bluetooth, bpf and netfilter.

  Current release - new code bugs:

   - bridge: switchdev: check br_vlan_group() return value

   - use this_cpu_inc() to increment net->core_stats, fix preempt-rt

  Previous releases - regressions:

   - eth: stmmac: fix write to sgmii_adapter_base

  Previous releases - always broken:

   - netfilter: nf_conntrack_tcp: re-init for syn packets only,
     resolving issues with TCP fastopen

   - tcp: md5: fix incorrect tcp_header_len for incoming connections

   - tcp: fix F-RTO may not work correctly when receiving DSACK

   - tcp: ensure use of most recently sent skb when filling rate samples

   - tcp: fix potential xmit stalls caused by TCP_NOTSENT_LOWAT

   - virtio_net: fix wrong buf address calculation when using xdp

   - xsk: fix forwarding when combining copy mode with busy poll

   - xsk: fix possible crash when multiple sockets are created

   - bpf: lwt: fix crash when using bpf_skb_set_tunnel_key() from
     bpf_xmit lwt hook

   - sctp: null-check asoc strreset_chunk in sctp_generate_reconf_event

   - wireguard: device: check for metadata_dst with skb_valid_dst()

   - netfilter: update ip6_route_me_harder to consider L3 domain

   - gre: make o_seqno start from 0 in native mode

   - gre: switch o_seqno to atomic to prevent races in collect_md mode

  Misc:

   - add Eric Dumazet to networking maintainers

   - dt: dsa: realtek: remove realtek,rtl8367s string

   - netfilter: flowtable: Remove the empty file"

* tag 'net-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (65 commits)
  tcp: fix F-RTO may not work correctly when receiving DSACK
  Revert "ibmvnic: Add ethtool private flag for driver-defined queue limits"
  net: enetc: allow tc-etf offload even with NETIF_F_CSUM_MASK
  ixgbe: ensure IPsec VF<->PF compatibility
  MAINTAINERS: Update BNXT entry with firmware files
  netfilter: nft_socket: only do sk lookups when indev is available
  net: fec: add missing of_node_put() in fec_enet_init_stop_mode()
  bnx2x: fix napi API usage sequence
  tls: Skip tls_append_frag on zero copy size
  Add Eric Dumazet to networking maintainers
  netfilter: conntrack: fix udp offload timeout sysctl
  netfilter: nf_conntrack_tcp: re-init for syn packets only
  net: dsa: lantiq_gswip: Don't set GSWIP_MII_CFG_RMII_CLK
  net: Use this_cpu_inc() to increment net->core_stats
  Bluetooth: hci_sync: Cleanup hci_conn if it cannot be aborted
  Bluetooth: hci_event: Fix creating hci_conn object on error status
  Bluetooth: hci_event: Fix checking for invalid handle on error status
  ice: fix use-after-free when deinitializing mailbox snapshot
  ice: wait 5 s for EMP reset after firmware flash
  ice: Protect vf_state check by cfg_lock in ice_vc_process_vf_msg()
  ...
2022-04-28 12:34:50 -07:00
Linus Torvalds
3c76fe7436 Thermal control fixes for 5.18-rc5
- Stop warning about deprecation of the userspace thermal governor
    and cooling device status interface, because there are cases in
    which user space has to drive thermal management with the help
    of them (Daniel Lezcano).
 
  - Fix attr.show callback prototype in the int340x thermal driver (Kees
    Cook).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmJqsGQSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxzqUQALXZIg4vp/h1VPqgz8fio1PWw4YjuBdk
 JPSSbl/go+9/yijswuCeA9YfISbQTuJzFBi91QVtfb+ZNy6N7inlZATp3fcmtvHS
 TFRMPtQZsHRYg5A+eOpHRYj5ct6Y0xdbwLoqAlXg9PPvnx5cWiNhafXMxy5dnU66
 sWsa75YqqsKXdedLiydbWQV4hRXzIzyvdBC9HdNd0TJKu8C/88RQKmzGKu+8WJ8v
 9wFF9hJLDZcRGt377DopJCowrc/wm889aFyA2niSbw8Fq3+cuEOUolUsjcE6NYcA
 PsE2w1uSQKrAqoXMQanCJN0NcmgAQ8oqho/jYUNP2TdpjzNO24u5dlb2MLAIHb0P
 WYc52MrzPK/9UjLilqSUylsg6wXCxB0gepK9yulEDH0i20Ybx24VjTlXf9NLepzY
 MiHTWJlpQe07S9j0sUIxyDD95uOMUHWZV8YfR1eBAH91OfV9u6kaD1YAaHhwBzU8
 aMK0dBR0GHycl6eh7qWBxytDpmlPIOLZW2wmO3bwsOo3XpYqqTqpXng4IUGlUu/N
 WQzVx3eOc62B05UPkj/bbMWGbu0lDzGYz0w4zjyrWl4ELx9ASgt1DU7b7ecXE9NM
 U4dIkyyvug7ppRNW/l68giNpSxhdBofjLLX3Cm+ASuG7xuUmc9tqFZAu817T2VzI
 yY8V4PfE95ZM
 =u3GS
 -----END PGP SIGNATURE-----

Merge tag 'thermal-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull thermal control fixes from Rafael Wysocki:
 "These take back recent chages that started to confuse users and fix up
  an attr.show callback prototype in a driver.

  Specifics:

   - Stop warning about deprecation of the userspace thermal governor
     and cooling device status interface, because there are cases in
     which user space has to drive thermal management with the help of
     them (Daniel Lezcano)

   - Fix attr.show callback prototype in the int340x thermal driver
     (Kees Cook)"

* tag 'thermal-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  thermal/governor: Remove deprecated information
  Revert "thermal/core: Deprecate changing cooling device state from userspace"
  thermal: int340x: Fix attr.show callback prototype
2022-04-28 11:57:00 -07:00
Linus Torvalds
659ed6e285 Power management updates for 5.18-rc5
- Fix issues with the Qualcomm's cpufreq driver (Dmitry Baryshkov,
    Vladimir Zapolskiy).
 
  - Fix memory leak with the Sun501 driver (Xiaobing Luo).
 
  - Make intel_idle enable C1E promotion on all CPUs when C1E is
    preferred to C1 (Artem Bityutskiy).
 
  - Make C6 optimization on Sapphire Rapids added recently work as
    expected if both C1E and C1 are "preferred" (Artem Bityutskiy).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmJqr7cSHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxlv0P/Rf3L00cpl+HGoUQ2yNb/7FDPu2hXyt2
 V1b3XVHxKlDfN6azKChdnJlczYw7/C/uGc6azjblZ2nN362jkYiWGRn8D7tX+nvy
 7NyajdZLRSOx+AchWOSpDjrOkq6TVcLnZLGLMYEX02JREKZmOqimfCNEKjfhO7Ht
 FwapPSqTAMdg7YmEhpUCkigUJCo6d436ZxMf6T02LHXLUTTN6qqBcEqHh81AWgvd
 LZbz1WC1srj0YZePalmMl6ToyO2p8qQjMIQZizIthvFOkhj58af2H687oU2dCNtp
 YoLctD7+ge14nhssnE3VD4cmOY4rzlK15e0EqPIflbuQcux2n5qNN3rLyJ0bqHSM
 zQUHy7cbBgnYMbhM7/FKXza51nsxWh5cSV0CQisv0Pck+Mj9PqAJssM6HhhmnKgz
 tyQvT742p98XH3pEq2RCLoJ01qe2ptl6/+u2HN36KA9zfPPAOY7eIvrZvliQ2Qgf
 R7V29ZxrT2gDkKhF71pKH1qcSY8/41x2uesKFJzFunj0+IBVFhKNECvqEgJuUbCo
 nnr4woxYfSZuDQFsf3/NC7OoM5z+SdeaPu0Y8f594wORL6pMHaTuo/8UwJEx2aWQ
 OxlhYeDwk0Zd/A7pdFqbFomX+PHU0Ah2UWNX/aaDybiBFm8LcFuIVGwJwy9r1HX1
 VjAsHV8bVaPo
 =b1TP
 -----END PGP SIGNATURE-----

Merge tag 'pm-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull power management fixes from Rafael Wysocki:
 "These fix up recent intel_idle driver changes and fix some ARM cpufreq
  driver issues.

  Specifics:

   - Fix issues with the Qualcomm's cpufreq driver (Dmitry Baryshkov,
     Vladimir Zapolskiy).

   - Fix memory leak with the Sun501 driver (Xiaobing Luo).

   - Make intel_idle enable C1E promotion on all CPUs when C1E is
     preferred to C1 (Artem Bityutskiy).

   - Make C6 optimization on Sapphire Rapids added recently work as
     expected if both C1E and C1 are "preferred" (Artem Bityutskiy)"

* tag 'pm-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  intel_idle: Fix SPR C6 optimization
  intel_idle: Fix the 'preferred_cstates' module parameter
  cpufreq: qcom-cpufreq-hw: Clear dcvs interrupts
  cpufreq: fix memory leak in sun50i_cpufreq_nvmem_probe
  cpufreq: qcom-cpufreq-hw: Fix throttle frequency value on EPSS platforms
  cpufreq: qcom-hw: provide online/offline operations
  cpufreq: qcom-hw: fix the opp entries refcounting
  cpufreq: qcom-hw: fix the race between LMH worker and cpuhp
  cpufreq: qcom-hw: drop affinity hint before freeing the IRQ
2022-04-28 11:50:21 -07:00
Linus Torvalds
f12d31c00b ACPI updates for 5.18-rc5
- Make the ACPI processor driver avoid falling back to C3 type of
    C-states when C3 cannot be requested (Ville Syrjälä).
 
  - Revert a quirk that is not necessary any more after fixing the
    underlying issue properly (Ville Syrjälä).
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEE4fcc61cGeeHD/fCwgsRv/nhiVHEFAmJqrTISHHJqd0Byand5
 c29ja2kubmV0AAoJEILEb/54YlRxQroQAKUTO482hWPQI5yj9WNB4GZBqYb9LrpI
 pX2chljJJ8STzJZEbLdzxZHE0KbqO/v0v1vUIUEAD3xIOkC0ObRGSz4yp5yKdLN6
 EvPgWONqcc/w41LXbzJzDgE53M+Gu//BGOBdNXfBq0MW3y0n5bbxlXRu4KdzYIsp
 Rtvzquxu9jTk+esjrImyS0u4xwF+tInocv2BA3gFax7F+eJZTeVpzbHvq5OjO7D7
 Gnlpv+YrP3tBCkAG7OO3Stm2CjXmEg4GdnSqFjT99JoyAn2uF4+vrtXigc/yi/CW
 bMG/czdBIk3LGU3l741LN41qK1wpVlrsafPWd/SZC2vY4sl10QX6fM2xbkTHXCMd
 y8p+HKn1Me50tXeUmqas1TLYHwefRxAVUIqHP3BC62XD4ZAiCtbBZAXeNnt4Ve4f
 iUq8z7OX6izllrsOXoyxANMc3xI7WpBCa/oa5UjjB8qciT1OaYKpw871g+J16qXv
 Bb8/bhEmrCxIgksF0dmsbCfFdit41+ylV2j/Eh0pKgg1pfO4GGwfRiWTVZinHwrw
 Px1y7DMGBiu7YenTNDj3qfoYzjfwgsRRT6uWopW7gK9f4CNLhFwyEpuNqmumBlYN
 fev2EwIWINMrs/s+irN68FZrbmtaEGsy9JORVQkrbWiIyhMPG76D0p+ID0+CfcLA
 sKQwxXBkwFog
 =NnhZ
 -----END PGP SIGNATURE-----

Merge tag 'acpi-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm

Pull ACPI fixes from Rafael WysockiL
 "These fix up the ACPI processor driver after a change made during the
  5.16 cycle that inadvertently broke falling back to shallower C-states
  when C3 cannot be used.

  Specifics:

   - Make the ACPI processor driver avoid falling back to C3 type of
     C-states when C3 cannot be requested (Ville Syrjälä)

   - Revert a quirk that is not necessary any more after fixing the
     underlying issue properly (Ville Syrjälä)"

* tag 'acpi-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
  Revert "ACPI: processor: idle: fix lockup regression on 32-bit ThinkPad T40"
  ACPI: processor: idle: Avoid falling back to C3 type C-states
2022-04-28 11:37:20 -07:00