1252797 Commits

Author SHA1 Message Date
Jakub Kicinski
4e887471e8 tools: ynl: rename make hardclean -> distclean
The make target to remove all generated files used to be called
"hardclean" because it deleted files which were tracked by git.
We no longer track generated user space files, so use the more
common "distclean" name.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Donald Hunter <donald.hunter@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 12:05:10 +00:00
David S. Miller
db72b6fc8f Merge branch '100GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
Tony Nguyen says:

====================
Intel Wired LAN Driver Updates 2024-03-04 (ice)

This series contains updates to ice driver only.

Jake changes the driver to use relative VSI index for VF VSIs as the VF
driver has no direct use of the VSI number on ice hardware. He also
reworks some Tx/Rx functions to clarify their uses, cleans up some style
issues, and utilizes kernel helper functions.

Maciej removes a redundant call to disable Tx queues on ifdown and
removes some unnecessary devm usages.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 11:29:19 +00:00
David S. Miller
39a096d67c Merge branch 'ravb-cleanups'
Niklas Söderlund says:

====================
ravb: Align Rx descriptor setup and maintenance

When RZ/G2L support was added the Rx code path was split in two, one to
support R-Car and one to support RZ/G2L. One reason for this is that
R-Car uses the extended Rx descriptor format, while RZ/G2L uses the
normal descriptor format.

In many aspects this is not needed as the extended descriptor format is
just a normal descriptor with extra metadata (timestamsp) appended. And
the R-Car SoCs can also use normal descriptors if hardware timestamps
were not desired. This split has led to RZ/G2L gaining support for
split descriptors in the Rx path while R-Car still lacks this.

This series is the first step in trying to merge the R-Car and RZ/G2L Rx
paths so features and bugs corrected in one will benefit the other.

The first patch in the series clarifies that the driver now supports
either normal or extended descriptors, not both at the same time by
grouping them in a union. This is the foundation that later patches will
build on the aligning the two Rx paths.

Patches 2-5 deals with correcting small issues in the Rx frame and
descriptor sizes that either were incorrect at the time they were added
in 2017 (my bad) or concepts built on-top of this initial incorrect
design.

While finally patch 6 merges the R-Car and RZ/G2L for Rx descriptor
setup and maintenance.

When this work has landed I plan to follow up with more work aligning
the rest of the Rx code paths and hopefully bring split descriptor
support to the R-Car SoCs.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 11:23:21 +00:00
Niklas Söderlund
644d037b2c ravb: Unify Rx ring maintenance code paths
The R-Car and RZ/G2L Rx code paths were split in two separate
implementations when support for RZ/G2L was added due to the fact that
R-Car uses the extended descriptor format while RZ/G2L uses normal
descriptors. This has led to a duplication of Rx logic with the only
difference being the different Rx descriptors types used. The
implementation however neglects to take into account that extended
descriptors are normal descriptors with additional metadata at the end
to carry hardware timestamp information.

The hardware timestamp information is only consumed in the R-Car Rx
loop and all the maintenance code around the Rx ring can be shared
between the two implementations if the difference in descriptor length
is carefully considered.

This change merges the two implementations for Rx ring maintenance by
adding a method to access both types of descriptors as normal
descriptors, as this part covers all the fields needed for Rx ring
maintenance the only difference between using normal or extended
descriptor is the size of the memory region to allocate/free and the
step size between each descriptor in the ring.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 11:23:21 +00:00
Niklas Söderlund
555419b225 ravb: Move maximum Rx descriptor data usage to info struct
To make it possible to merge the R-Car and RZ/G2L code paths move the
maximum usable size of a single Rx descriptor data slice into the
hardware information instead of using two different defines in the two
different code paths.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 11:23:21 +00:00
Niklas Söderlund
4968633881 ravb: Use the max frame size from hardware info for RZ/G2L
Remove the define describing the RZ/G2L maximum frame size and only use
the information in the hardware information struct. This will make it
easier to merge the R-Car and RZ/G2L code paths.

There is no functional change as both the define and the maximum frame
length in the hardware information is set to 8K.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 11:23:21 +00:00
Niklas Söderlund
cfbad64706 ravb: Create helper to allocate skb and align it
The EtherAVB device requires the SKB data to be aligned to 128 bytes.
The alignment is done by allocating an skb 128 bytes larger than the
maximum frame size supported by the device and adjusting the headroom to
fit the requirement.

This code has been refactored a few times and small issues have been
added along the way. The issues are not harmful but prevent merging
parts of the Rx code which have been split in two implementations with
the addition of RZ/G2L support, a device that supports larger frame
sizes.

This change removes the need for duplicated and somewhat inaccurate
hardware alignment constrains stored in the hardware information struct
by creating a helper to handle the allocation of an skb and alignment of
an skb data.

For the R-Car class of devices the maximum frame size is 4K and each
descriptor is limited to 2K of data. The current implementation does not
support split descriptors, this limits the frame size to 2K. The
current hardware information however records the descriptor size just
under 2K due to bad understanding of the device when larger MTUs where
added.

For the RZ/G2L device the maximum frame size is 8K and each descriptor
is limited to 4K of data. The current hardware information records this
correctly, but it gets the alignment constrains wrong as just aligns it
by 128, it does not extend it by 128 bytes to allow the full frame to be
stored. This works because the RZ/G2L device supports split descriptors
and allocates each skb to 8K and aligns each 4K descriptor in this
space.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 11:23:21 +00:00
Niklas Söderlund
e82700b866 ravb: Make it clear the information relates to maximum frame size
The struct member rx_max_buf_size was added before split descriptor
support was added. It is unclear if the value describes the full skb
frame buffer or the data descriptor buffer which can be combined into a
single skb.

Rename it to make it clear it referees to the maximum frame size and can
cover multiple descriptors.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 11:23:21 +00:00
Niklas Söderlund
4123c3fbf8 ravb: Group descriptor types used in Rx ring
The Rx ring can either be made up of normal or extended descriptors, not
a mix of the two at the same time. Make this explicit by grouping the
two variables in a rx_ring union.

The extension of the storage for more than one queue of normal
descriptors from a single to NUM_RX_QUEUE queues have no practical
effect. But aids in making the code readable as the code that uses it
already piggyback on other members of struct ravb_private that are
arrays of max length NUM_RX_QUEUE, e.g. rx_desc_dma. This will also make
further refactoring easier.

While at it, rename the normal descriptor Rx ring to make it clear it's
not strictly related to the GbEthernet E-MAC IP found in RZ/G2L, normal
descriptors could be used on R-Car SoCs too.

Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>
Reviewed-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 11:23:21 +00:00
David S. Miller
dbb0b6ca7d Merge branch '200GbE' of git://git.kernel.org/pub/scm/linux/kernel/git/tnguy/next-queue
From: Tony Nguyen <anthony.l.nguyen@intel.com>
To: davem@davemloft.net, kuba@kernel.org, pabeni@redhat.com,
	edumazet@google.com, netdev@vger.kernel.org
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>, alan.brady@intel.com
Tony Nguyen says:

====================
idpf: refactor virtchnl messages

Alan Brady says:

The motivation for this series has two primary goals. We want to enable
support of multiple simultaneous messages and make the channel more
robust. The way it works right now, the driver can only send and receive
a single message at a time and if something goes really wrong, it can
lead to data corruption and strange bugs.

To start the series, we introduce an idpf_virtchnl.h file. This reduces
the burden on idpf.h which is overloaded with struct and function
declarations.

The conversion works by conceptualizing a send and receive as a
"virtchnl transaction" (idpf_vc_xn) and introducing a "transaction
manager" (idpf_vc_xn_manager). The vcxn_mngr will init a ring of
transactions from which the driver will pop from a bitmap of free
transactions to track in-flight messages. Instead of needing to handle a
complicated send/recv for every a message, the driver now just needs to
fill out a xn_params struct and hand it over to idpf_vc_xn_exec which
will take care of all the messy bits. Once a message is sent and
receives a reply, we leverage the completion API to signal the received
buffer is ready to be used (assuming success, or an error code
otherwise).

At a low-level, this implements the "sw cookie" field of the virtchnl
message descriptor to enable this. We have 16 bits we can put whatever
we want and the recipient is required to apply the same cookie to the
reply for that message.  We use the first 8 bits as an index into the
array of transactions to enable fast lookups and we use the second 8
bits as a salt to make sure each cookie is unique for that message. As
transactions are received in arbitrary order, it's possible to reuse a
transaction index and the salt guards against index conflicts to make
certain the lookup is correct. As a primitive example, say index 1 is
used with salt 1. The message times out without receiving a reply so
index 1 is renewed to be ready for a new transaction, we report the
timeout, and send the message again. Since index 1 is free to be used
again now, index 1 is again sent but now salt is 2. This time we do get
a reply, however it could be that the reply is _actually_ for the
previous send index 1 with salt 1.  Without the salt we would have no
way of knowing for sure if it's the correct reply, but with we will know
for certain.

Through this conversion we also get several other benefits. We can now
more appropriately handle asynchronously sent messages by providing
space for a callback to be defined. This notably allows us to handle MAC
filter failures better; previously we could potentially have stale,
failed filters in our list, which shouldn't really have a major impact
but is obviously not correct. I also managed to remove fairly
significant more lines than I added which is a win in my book.

Additionally, this converts some variables to use auto-variables where
appropriate. This makes the alloc paths much cleaner and less prone to
memory leaks. We also fix a few virtchnl related bugs while we're here.

====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 10:30:08 +00:00
David S. Miller
784ee615af Merge branch 'netlink-emsgsize'
Jakub Kicinski says:

====================
netlink: handle EMSGSIZE errors in the core

Ido discovered some time back that we usually force NLMSG_DONE
to be delivered in a separate recv() syscall, even if it would
fit into the same skb as data messages. He made nexthop try
to fit DONE with data in commit 8743aeff5bc4 ("nexthop: Fix
infinite nexthop bucket dump when using maximum nexthop ID"),
and nobody has complained so far.

We have since also tried to follow the same pattern in new
genetlink families, but explaining to people, or even remembering
the correct handling ourselves is tedious.

Let the netlink socket layer consume -EMSGSIZE errors.
Practically speaking most families use this error code
as "dump needs more space", anyway.

v2:
 - init err to 0 in last patch
v1: https://lore.kernel.org/all/20240301012845.2951053-1-kuba@kernel.org/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 08:07:45 +00:00
Jakub Kicinski
87d381973e genetlink: fit NLMSG_DONE into same read() as families
Make sure ctrl_fill_info() returns sensible error codes and
propagate them out to netlink core. Let netlink core decide
when to return skb->len and when to treat the exit as an
error. Netlink core does better job at it, if we always
return skb->len the core doesn't know when we're done
dumping and NLMSG_DONE ends up in a separate read().

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 08:07:45 +00:00
Jakub Kicinski
0b11b1c5c3 netdev: let netlink core handle -EMSGSIZE errors
Previous change added -EMSGSIZE handling to af_netlink, we don't
have to hide these errors any longer.

Theoretically the error handling changes from:
 if (err == -EMSGSIZE)
to
 if (err == -EMSGSIZE && skb->len)

everywhere, but in practice it doesn't matter.
All messages fit into NLMSG_GOODSIZE, so overflow of an empty
skb cannot happen.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 08:07:44 +00:00
Jakub Kicinski
b5a899154a netlink: handle EMSGSIZE errors in the core
Eric points out that our current suggested way of handling
EMSGSIZE errors ((err == -EMSGSIZE) ? skb->len : err) will
break if we didn't fit even a single object into the buffer
provided by the user. This should not happen for well behaved
applications, but we can fix that, and free netlink families
from dealing with that completely by moving error handling
into the core.

Let's assume from now on that all EMSGSIZE errors in dumps are
because we run out of skb space. Families can now propagate
the error nla_put_*() etc generated and not worry about any
return value magic. If some family really wants to send EMSGSIZE
to user space, assuming it generates the same error on the next
dump iteration the skb->len should be 0, and user space should
still see the EMSGSIZE.

This should simplify families and prevent mistakes in return
values which lead to DONE being forced into a separate recv()
call as discovered by Ido some time ago.

Reviewed-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-03-06 08:07:44 +00:00
Jakub Kicinski
e3350ba4a5 selftests: avoid using SKIP(exit()) in harness fixure setup
selftest harness uses various exit codes to signal test
results. Avoid calling exit() directly, otherwise tests
may get broken by harness refactoring (like the commit
under Fixes). SKIP() will instruct the harness that the
test shouldn't run, it used to not be the case, but that
has been fixed. So just return, no need to exit.

Note that for hmm-tests this actually changes the result
from pass to skip. Which seems fair, the test is skipped,
after all.

Reported-by: Mark Brown <broonie@kernel.org>
Link: https://lore.kernel.org/all/05f7bf89-04a5-4b65-bf59-c19456aeb1f0@sirena.org.uk
Fixes: a724707976b0 ("selftests: kselftest_harness: use KSFT_* exit codes")
Reviewed-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Mark Brown <broonie@kernel.org>
Tested-by: Mark Brown <broonie@kernel.org>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://lore.kernel.org/r/20240304233621.646054-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:25:36 -08:00
Jakub Kicinski
6d0f77a0e3 Merge branch 'net-ethernet-rework-eee'
Oleksij Rempel says:

====================
net: ethernet: Rework EEE

with Andrew's permission I'll continue mainlining this patches:
==============================================================

Most MAC drivers get EEE wrong. The API to the PHY is not very
obvious, which is probably why. Rework the API, pushing most of the
EEE handling into phylib core, leaving the MAC drivers to just
enable/disable support for EEE in there change_link call back.

MAC drivers are now expect to indicate to phylib if they support
EEE. This will allow future patches to configure the PHY to advertise
no EEE link modes when EEE is not supported. The information could
also be used to enable SmartEEE if the PHY supports it.

With these changes, the uAPI configuration eee_enable becomes a global
on/off. tx-lpi must also be enabled before EEE is enabled. This fits
the discussion here:

https://lore.kernel.org/netdev/af880ce8-a7b8-138e-1ab9-8c89e662eecf@gmail.com/T/

This patchset puts in place all the infrastructure, and converts one
MAC driver to the new API. Following patchsets will convert other MAC
drivers, extend support into phylink, and when all MAC drivers are
converted to the new scheme, clean up some unneeded code.
====================

Link: https://lore.kernel.org/r/20240302195306.3207716-1-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:21:20 -08:00
Andrew Lunn
6a2495adc0 net: fec: Fixup EEE
The enabling/disabling of EEE in the MAC should happen as a result of
auto negotiation. So move the enable/disable into
fec_enet_adjust_link() which gets called by phylib when there is a
change in link status.

fec_enet_set_eee() now just stores away the LPI timer value.
Everything else is passed to phylib, so it can correctly setup the
PHY.

fec_enet_get_eee() relies on phylib doing most of the work,
the MAC driver just adds the LPI timer value.

Call phy_support_eee() if the quirk is present to indicate the MAC
actually supports EEE.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Tested-by: Oleksij Rempel <o.rempel@pengutronix.de> (On iMX8MP debix)
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://lore.kernel.org/r/20240302195306.3207716-8-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:21:17 -08:00
Andrew Lunn
aff1b8c84b net: fec: Move fec_enet_eee_mode_set() and helper earlier
FEC is about to get its EEE code re-written. To allow this, move
fec_enet_eee_mode_set() before fec_enet_adjust_link() which will
need to call it.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Wei Fang <wei.fang@nxp.com>
Link: https://lore.kernel.org/r/20240302195306.3207716-7-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:21:17 -08:00
Andrew Lunn
49168d1980 net: phy: Add phy_support_eee() indicating MAC support EEE
In order for EEE to operate, both the MAC and the PHY need to support
it, similar to how pause works. With some exception - a number of PHYs
have SmartEEE or AutoGrEEEn support in order to provide some EEE-like
power savings with non-EEE capable MACs.

Copy the pause concept and add the call phy_support_eee() which the MAC
makes after connecting the PHY to indicate it supports EEE. phylib will
then advertise EEE when auto-neg is performed.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240302195306.3207716-6-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:21:17 -08:00
Andrew Lunn
3e43b903da net: phy: Immediately call adjust_link if only tx_lpi_enabled changes
The MAC driver changes its EEE hardware configuration in its
adjust_link callback. This is called when auto-neg
completes. Disabling EEE via eee_enabled false will trigger an
autoneg, and as a result the adjust_link callback will be called with
phydev->enable_tx_lpi set to false. Similarly, eee_enabled set to true
and with a change of advertised link modes will result in a new
autoneg, and a call the adjust_link call.

If set_eee is called with only a change to tx_lpi_enabled which does
not trigger an auto-neg, it is necessary to call the adjust_link
callback so that the MAC is reconfigured to take this change into
account.

When setting phydev->enable_tx_lpi, take both eee_enabled and
tx_lpi_enabled into account, so the MAC drivers just needs to act on
phydev->enable_tx_lpi and not the whole EEE configuration.
The same check should be done for tx_lpi_timer too.

Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240302195306.3207716-5-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:21:17 -08:00
Andrew Lunn
fe0d4fd928 net: phy: Keep track of EEE configuration
Have phylib keep track of the EEE configuration. This simplifies the
MAC drivers, in that they don't need to store it.

Future patches to phylib will also make use of this information to
further simplify the MAC drivers.

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240302195306.3207716-4-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:21:17 -08:00
Andrew Lunn
e3b6876ab8 net: phy: Add phydev->enable_tx_lpi to simplify adjust link callbacks
MAC drivers which support EEE need to know the results of the EEE
auto-neg in order to program the hardware to perform EEE or not.  The
oddly named phy_init_eee() can be used to determine this, it returns 0
if EEE should be used, or a negative error code,
e.g. -EOPPROTONOTSUPPORT if the PHY does not support EEE or negotiate
resulted in it not being used.

However, many MAC drivers get this wrong. Add phydev->enable_tx_lpi
which indicates the result of the autoneg for EEE, including if EEE is
administratively disabled with ethtool. The MAC driver can then access
this in the same way as link speed and duplex in the adjust link
callback. If enable_tx_lpi is true, the MAC should send low power
indications and does not need to consider anything else with respect
to EEE.

Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240302195306.3207716-3-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:21:17 -08:00
Russell King
6f2fc8584a net: add helpers for EEE configuration
Add helpers that phylib and phylink can use to manage EEE configuration
and determine whether the MAC should be permitted to use LPI based on
that configuration.

Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Signed-off-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Florian Fainelli <florian.fainelli@broadcom.com>
Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
Link: https://lore.kernel.org/r/20240302195306.3207716-2-o.rempel@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:21:17 -08:00
Heiner Kallweit
344f7a4651 ethtool: ignore unused/unreliable fields in set_eee op
This function is used with the set_eee() ethtool operation. Certain
fields of struct ethtool_keee() are relevant only for the get_eee()
operation. In addition, in case of the ioctl interface, we have no
guarantee that userspace sends sane values in struct ethtool_eee.
Therefore explicitly ignore all fields not needed for set_eee().
This protects from drivers trying to use unchecked and unreliable
data, relying on specific userspace behavior.

Note: Such unsafe driver behavior has been found and fixed in the
tg3 driver.

Signed-off-by: Heiner Kallweit <hkallweit1@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/ad7ee11e-eb7a-4975-9122-547e13a161d8@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 19:07:13 -08:00
Kees Cook
ff73f8344e sock: Use unsafe_memcpy() for sock_copy()
While testing for places where zero-sized destinations were still showing
up in the kernel, sock_copy() and inet_reqsk_clone() were found, which
are using very specific memcpy() offsets for both avoiding a portion of
struct sock, and copying beyond the end of it (since struct sock is really
just a common header before the protocol-specific allocation). Instead
of trying to unravel this historical lack of container_of(), just switch
to unsafe_memcpy(), since that's effectively what was happening already
(memcpy() wasn't checking 0-sized destinations while the code base was
being converted away from fake flexible arrays).

Avoid the following false positive warning with future changes to
CONFIG_FORTIFY_SOURCE:

  memcpy: detected field-spanning write (size 3068) of destination "&nsk->__sk_common.skc_dontcopy_end" at net/core/sock.c:2057 (size 0)

Signed-off-by: Kees Cook <keescook@chromium.org>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240304212928.make.772-kees@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 18:35:12 -08:00
Breno Leitao
4166204d7e net: tap: Remove generic .ndo_get_stats64
Commit 3e2f544dd8a33 ("net: get stats64 if device if driver is
configured") moved the callback to dev_get_tstats64() to net core, so,
unless the driver is doing some custom stats collection, it does not
need to set .ndo_get_stats64.

Since this driver is now relying in NETDEV_PCPU_STAT_TSTATS, then, it
doesn't need to set the dev_get_tstats64() generic .ndo_get_stats64
function pointer.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240304183810.1474883-2-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 18:32:33 -08:00
Breno Leitao
46f480ec14 net: tuntap: Leverage core stats allocator
With commit 34d21de99cea9 ("net: Move {l,t,d}stats allocation to core and
convert veth & vrf"), stats allocation could be done on net core
instead of in this driver.

With this new approach, the driver doesn't have to bother with error
handling (allocation failure checking, making sure free happens in the
right spot, etc). This is core responsibility now.

Remove the allocation in the tun/tap driver and leverage the network
core allocation instead.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240304183810.1474883-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 18:32:33 -08:00
Uwe Kleine-König
60d06425e0 ptp: fc3: Convert to platform remove callback returning void
The .remove() callback for a platform driver returns an int which makes
many driver authors wrongly assume it's possible to do error handling by
returning an error code. However the value returned is ignored (apart
from emitting a warning) and this typically results in resource leaks.

To improve here there is a quest to make the remove callback return
void. In the first step of this quest all drivers are converted to
.remove_new(), which already returns void. Eventually after all drivers
are converted, .remove_new() will be renamed to .remove().

Trivially convert this driver from always returning zero in the remove
callback to the void returning variant.

Signed-off-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240304091325.717546-2-u.kleine-koenig@pengutronix.de
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:23:02 -08:00
Jakub Kicinski
73a42f4174 Merge branch 'net-constify-struct-class-usage'
Ricardo B. Marliere says:

====================
net: constify struct class usage

This is a simple and straight forward cleanup series that aims to make the
class structures in net constant. This has been possible since 2023 [1].

[1]: https://lore.kernel.org/all/2023040248-customary-release-4aec@gregkh/
====================

Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-0-8fa378595b93@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:21:22 -08:00
Ricardo B. Marliere
e556001169 nfc: core: make nfc_class constant
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the nfc_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-6-8fa378595b93@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:21:18 -08:00
Ricardo B. Marliere
070bef83f0 net: wwan: core: make wwan_class constant
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the wwan_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-5-8fa378595b93@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:21:18 -08:00
Ricardo B. Marliere
d9567f212b net: wwan: hwsim: make wwan_hwsim_class constant
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the wwan_hwsim_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Acked-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-4-8fa378595b93@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:21:18 -08:00
Ricardo B. Marliere
2ad2018aa3 net: ppp: make ppp_class constant
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the ppp_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-3-8fa378595b93@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:21:18 -08:00
Ricardo B. Marliere
63767a7631 net: wan: framer: make framer_class constant
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the framer_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Acked-by: Herve Codina <herve.codina@bootlin.com>
Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-2-8fa378595b93@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:21:17 -08:00
Ricardo B. Marliere
b6e3c115ef net: hns: make hnae_class constant
Since commit 43a7206b0963 ("driver core: class: make class_register() take
a const *"), the driver core allows for struct class to be in read-only
memory, so move the hnae_class structure to be declared at build time
placing it into read-only memory, instead of having to be dynamically
allocated at boot time.

Suggested-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Ricardo B. Marliere <ricardo@marliere.net>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240302-class_cleanup-net-next-v1-1-8fa378595b93@marliere.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:21:17 -08:00
Jakub Kicinski
e8efd372b2 Merge branch 'net-phy-micrel-lan8814-erratas'
Horatiu Vultur says:

====================
net: phy: micrel: lan8814 erratas

Add two erratas for lan8814. First one fix the led which might
stay on even that there is no link. The second one improves increases
length of the cable that can be used when used in 1000Base-T.
====================

Link: https://lore.kernel.org/r/20240304091548.1386022-1-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:17:56 -08:00
Horatiu Vultur
ad080db448 net: phy: micrel: lan8814 cable improvement errata
When the length of the cable is more than 100m and the lan8814 is
configured to run in 1000Base-T Slave then the register of the device
needs to be optimized.

Workaround this by setting the measure time to a value of 0xb. This
value can be set regardless of the configuration.

This issue is described in 'LAN8814 Silicon Errata and Data Sheet
Clarification' and according to that, this will not be corrected in a
future silicon revision.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Acked-by: Arun Ramadoss <arun.ramadoss@microchip.com>
Link: https://lore.kernel.org/r/20240304091548.1386022-3-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:17:46 -08:00
Horatiu Vultur
e9097f8e1e net: phy: micrel: lan8814 led errata
Lan8814 phy led behavior is not correct. It was noticed that the led
still remains ON when the cable is unplugged while there was traffic
passing at that time.

The fix consists in clearing bit 10 of register 0x38, in this way the
led behaviour is correct and gets OFF when there is no link.

Reviewed-by: Wojciech Drewek <wojciech.drewek@intel.com>
Signed-off-by: Horatiu Vultur <horatiu.vultur@microchip.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240304091548.1386022-2-horatiu.vultur@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 11:17:46 -08:00
Jakub Kicinski
5ad051235c Merge branch 'selftests-forwarding-various-improvements'
Ido Schimmel says:

====================
selftests: forwarding: Various improvements

This patchset speeds up the multipath tests (patches #1-#2) and makes
other tests more stable (patches #3-#6) so that they will not randomly
fail in the netdev CI.

On my system, after applying the first two patches, the run time of
gre_multipath_nh_res.sh is reduced by over 90%.
====================

Link: https://lore.kernel.org/r/20240304095612.462900-1-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 09:18:21 -08:00
Ido Schimmel
35df2ce896 selftests: forwarding: Make {, ip6}gre-inner-v6-multipath tests more robust
These tests generate various IPv6 flows, encapsulate them in GRE packets
and check that the encapsulated packets are distributed between the
available nexthops according to the configured weights.

Unlike the corresponding IPv4 tests, these tests sometimes fail in the
netdev CI because of large discrepancies between the expected and
measured ratios [1]. This can be explained by the fact that the IPv4
tests generate about 3,600 different flows whereas the IPv6 tests only
generate about 784 different flows (potentially by mistake).

Fix by aligning the IPv6 tests to the IPv4 ones and increase the number
of generated flows.

[1]
 [...]
 # TEST: ping                                                          [ OK ]
 # INFO: Running IPv6 over GRE over IPv4 multipath tests
 # TEST: ECMP                                                          [FAIL]
 # Too large discrepancy between expected and measured ratios
 # INFO: Expected ratio 1.00 Measured ratio 1.18
 [...]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20240304095612.462900-7-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 09:18:17 -08:00
Ido Schimmel
f0008b0497 selftests: forwarding: Make VXLAN ECN encap tests more robust
These tests sometimes fail on the netdev CI because the expected number
of packets is larger than expected [1].

Make the tests more robust by specifically matching on VXLAN
encapsulated packets and allowing up to five stray packets instead of
just two.

[1]
 [...]
 # TEST: VXLAN: ECN encap: 0x00->0x00                                  [FAIL]
 # v1: Expected to capture 10 packets, got 13.
 # TEST: VXLAN: ECN encap: 0x01->0x01                                  [ OK ]
 # TEST: VXLAN: ECN encap: 0x02->0x02                                  [ OK ]
 # TEST: VXLAN: ECN encap: 0x03->0x02                                  [ OK ]
 [...]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20240304095612.462900-6-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 09:18:13 -08:00
Ido Schimmel
dfbab74044 selftests: forwarding: Make vxlan-bridge-1q pass on debug kernels
The ageing time used by the test is too short for debug kernels and
results in entries being aged out prematurely [1].

Fix by increasing the ageing time.

[1]
 # ./vxlan_bridge_1q.sh
 [...]
 INFO: learning vlan 10
 TEST: VXLAN: flood before learning                                  [ OK ]
 TEST: VXLAN: show learned FDB entry                                 [ OK ]
 TEST: VXLAN: learned FDB entry                                      [FAIL]
         swp4: Expected to capture 0 packets, got 10.
 RTNETLINK answers: No such file or directory
 TEST: VXLAN: deletion of learned FDB entry                          [ OK ]
 TEST: VXLAN: Ageing of learned FDB entry                            [FAIL]
         swp4: Expected to capture 0 packets, got 10.
 TEST: VXLAN: learning toggling on bridge port                       [ OK ]
 [...]

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20240304095612.462900-5-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 09:18:10 -08:00
Ido Schimmel
4aca9eae6f selftests: forwarding: Make tc-police pass on debug kernels
The test configures a policer with a rate of 80Mbps and expects to
measure a rate close to it. This is a too high rate for debug kernels,
causing the test to fail [1].

Fix by reducing the rate to 10Mbps.

[1]
 # ./tc_police.sh
 TEST: police on rx                                                  [FAIL]
         Expected rate 76.2Mbps, got 29.6Mbps, which is -61% off. Required accuracy is +-10%.
 TEST: police on tx                                                  [FAIL]
         Expected rate 76.2Mbps, got 30.4Mbps, which is -60% off. Required accuracy is +-10%.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20240304095612.462900-4-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 09:18:07 -08:00
Ido Schimmel
748d27447d selftests: forwarding: Parametrize mausezahn delay
The various multipath tests use mausezahn to generate different flows
and check how they are distributed between the available nexthops. The
tool is currently invoked with an hard coded transmission delay of 1 ms.
This is unnecessary when the tests are run with veth pairs and
needlessly prolongs the tests.

Parametrize this delay and default it to 0 us. It can be overridden
using the forwarding.config file. On my system, this reduces the run
time of router_multipath.sh by 93%.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20240304095612.462900-3-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 09:18:04 -08:00
Ido Schimmel
7b2d64f933 selftests: forwarding: Remove IPv6 L3 multipath hash tests
The multipath tests currently test both the L3 and L4 multipath hash
policies for IPv6, but only the L4 policy for IPv4. The reason is mostly
historic: When the initial multipath test was added
(router_multipath.sh) the IPv6 L4 policy did not exist and was later
added to the test. The other multipath tests copied this pattern
although there is little value in testing both policies.

Align the IPv4 and IPv6 tests and only test the L4 policy. On my system,
this reduces the run time of router_multipath.sh by 89% because of the
repeated ping6 invocations to randomize the flow label.

Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Link: https://lore.kernel.org/r/20240304095612.462900-2-idosch@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-03-05 09:18:00 -08:00
Eric Dumazet
00af2aa93b net/smc: reduce rtnl pressure in smc_pnet_create_pnetids_list()
Many syzbot reports show extreme rtnl pressure, and many of them hint
that smc acquires rtnl in netns creation for no good reason [1]

This patch returns early from smc_pnet_net_init()
if there is no netdevice yet.

I am not even sure why smc_pnet_create_pnetids_list() even exists,
because smc_pnet_netdev_event() is also calling
smc_pnet_add_base_pnetid() when handling NETDEV_UP event.

[1] extract of typical syzbot reports

2 locks held by syz-executor.3/12252:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
2 locks held by syz-executor.4/12253:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
2 locks held by syz-executor.1/12257:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
2 locks held by syz-executor.2/12261:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
2 locks held by syz-executor.0/12265:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
2 locks held by syz-executor.3/12268:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
2 locks held by syz-executor.4/12271:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
2 locks held by syz-executor.1/12274:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878
2 locks held by syz-executor.2/12280:
  #0: ffffffff8f369610 (pernet_ops_rwsem){++++}-{3:3}, at: copy_net_ns+0x4c7/0x7b0 net/core/net_namespace.c:491
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_create_pnetids_list net/smc/smc_pnet.c:809 [inline]
  #1: ffffffff8f375b88 (rtnl_mutex){+.+.}-{3:3}, at: smc_pnet_net_init+0x10a/0x1e0 net/smc/smc_pnet.c:878

Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wenjia Zhang <wenjia@linux.ibm.com>
Cc: Jan Karcher <jaka@linux.ibm.com>
Cc: "D. Wythe" <alibuda@linux.alibaba.com>
Cc: Tony Lu <tonylu@linux.alibaba.com>
Cc: Wen Gu <guwen@linux.alibaba.com>
Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>
Link: https://lore.kernel.org/r/20240302100744.3868021-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05 15:49:35 +01:00
Paolo Abeni
eead059950 linux-can-next-for-6.9-20240304
-----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCgAxFiEEUEC6huC2BN0pvD5fKDiiPnotvG8FAmXlkC8THG1rbEBwZW5n
 dXRyb25peC5kZQAKCRAoOKI+ei28b1tVB/9S9z5OTWJipe5PqY3z2zodTi7KzUXA
 Y0svHGDJN86P7COGKdAGDhhXmzTZ05uxhHw2TJyim3C67HDHZwVRc/AmLa+SQB8S
 TZP9uJEqE3O18rYGjhkE0Fni3viJjB+1WinoElRrmfQO+3u480xzyv09nIImY6kf
 WIk9PcQvWyf/nWr/D60BVe9Ifw4VSID+jsoppQTgE6XK3NI4IFjXt93W1Ln0E8u0
 ld5vqEjanwevljBcIGPaI3WrHOlwfTAQhVDMXvvEr5NJ5AnvC2BkXkEj6AXnT1mj
 8xaIi/e3rAfYBTypZ2EY1ZGKDslTMVAuztDhPqmoxommsw3ldoQi6Yz3
 =JobK
 -----END PGP SIGNATURE-----

Merge tag 'linux-can-next-for-6.9-20240304' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next

Marc Kleine-Budde says:

====================
pull-request: can-next 2024-03-04

this is a pull request of 4 patches for net-next/master.

The 1st patch is by Jimmy Assarsson and adds support for the Leaf v3
to the kvaser_usb driver.

Martin Jocić's patch targets the kvaser_pciefd driver and adds support
for the Kvaser PCIe 8xCAN device.

Followed by a patch by me that adds a missing a cpu_to_le32() to the
gs_usb driver, the change is not critical as the assigned value is 0.

The last patch is also by me and replaces a literal 256 with a proper
define.

linux-can-next-for-6.9-20240304

* tag 'linux-can-next-for-6.9-20240304' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can-next:
  can: mcp251xfd: __mcp251xfd_get_berr_counter(): use CAN_BUS_OFF_THRESHOLD instead of open coding it
  can: gs_usb: gs_cmd_reset(): use cpu_to_le32() to assign mode
  can: kvaser_pciefd: Add support for Kvaser PCIe 8xCAN
  can: kvaser_usb: Add support for Leaf v3
====================

Link: https://lore.kernel.org/r/20240304092051.3631481-1-mkl@pengutronix.de
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05 13:57:31 +01:00
Abhishek Chauhan
885c36e59f net: Re-use and set mono_delivery_time bit for userspace tstamp packets
Bridge driver today has no support to forward the userspace timestamp
packets and ends up resetting the timestamp. ETF qdisc checks the
packet coming from userspace and encounters to be 0 thereby dropping
time sensitive packets. These changes will allow userspace timestamps
packets to be forwarded from the bridge to NIC drivers.

Setting the same bit (mono_delivery_time) to avoid dropping of
userspace tstamp packets in the forwarding path.

Existing functionality of mono_delivery_time remains unaltered here,
instead just extended with userspace tstamp support for bridge
forwarding path.

Signed-off-by: Abhishek Chauhan <quic_abchauha@quicinc.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Link: https://lore.kernel.org/r/20240301201348.2815102-1-quic_abchauha@quicinc.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05 13:41:16 +01:00
Paolo Abeni
d35c9659e5 Merge branch 'net-gro-cleanups-and-fast-path-refinement'
Eric Dumazet says:

====================
net: gro: cleanups and fast path refinement

Current GRO stack has a 'fast path' for a subset of drivers,
users of napi_frags_skb().

With TCP zerocopy/direct uses, header split at receive is becoming
more important, and GRO fast path is disabled.

This series makes GRO (a bit) more efficient for almost all use cases.
====================

Link: https://lore.kernel.org/r/20240301193740.3436871-1-edumazet@google.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05 13:30:16 +01:00
Eric Dumazet
8f78010b70 tcp: gro: micro optimizations in tcp[4]_gro_complete()
In tcp_gro_complete() :

Moving the skb->inner_transport_header setting
allows the compiler to reuse the previously loaded value
of skb->transport_header.

Caching skb_shinfo() avoids duplications as well.

In tcp4_gro_complete(), doing a single change on
skb_shinfo(skb)->gso_type also generates better code.

Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-03-05 13:30:11 +01:00