Commit Graph

1311478 Commits

Author SHA1 Message Date
Dr. David Alan Gilbert
d3e80070b5 sfc: Remove more unused functions
efx_ticks_to_usecs(), efx_reconfigure_port(), efx_ptp_get_mode(), and
efx_tx_get_copy_buffer_limited() are unused.
They seem to be partially due to the later splits to Siena, but
some seem unused for longer.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-5-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:35:11 -08:00
Dr. David Alan Gilbert
5254fdfc74 sfc: Remove unused mcdi functions
efx_mcdi_flush_rxqs(), efx_mcdi_rpc_async_quiet(),
efx_mcdi_rpc_finish_quiet(), and efx_mcdi_wol_filter_get_magic()
are unused.
I think these are fall out from the split into Siena
that happened in
commit 4d49e5cd4b ("sfc/siena: Rename functions in mcdi headers to avoid
conflicts with sfc")
and
commit d48523cb88 ("sfc: Copy shared files needed for Siena (part 2)")

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-4-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:35:11 -08:00
Dr. David Alan Gilbert
70e58249a6 sfc: Remove unused efx_mae_mport_vf
efx_mae_mport_vf() has been unused since
commit 5227adff37 ("sfc: add mport lookup based on driver's mport data")

Remove it.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-3-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:35:11 -08:00
Dr. David Alan Gilbert
cc4914d904 sfc: Remove falcon deadcode
ef4_farch_dimension_resources(), ef4_nic_fix_nodesc_drop_stat(),
ef4_ticks_to_usecs() and ef4_tx_get_copy_buffer_limited() were
copied over from efx_ equivalents in 2016 but never used by
commit 5a6681e22c ("sfc: separate out SFC4000 ("Falcon") support into new
sfc-falcon driver")

EF4_MAX_FLUSH_TIME is also unused.

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Acked-by: Martin Habets <habetsm.xilinx@gmail.com>
Link: https://patch.msgid.link/20241102151625.39535-2-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:35:11 -08:00
Maurice Lambert
84bfbfbbd3 netlink: typographical error in nlmsg_type constants definition
This commit fix a typographical error in netlink nlmsg_type constants definition in the include/uapi/linux/rtnetlink.h at line 177. The definition is RTM_NEWNVLAN RTM_NEWVLAN instead of RTM_NEWVLAN RTM_NEWVLAN.

Signed-off-by: Maurice Lambert <mauricelambert434@gmail.com>
Fixes: 8dcea18708 ("net: bridge: vlan: add rtm definitions and dump support")
Link: https://patch.msgid.link/20241103223950.230300-1-mauricelambert434@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:33:55 -08:00
Vadim Fedorenko
6c0828d00f bnxt_en: replace PTP spinlock with seqlock
We can see high contention on ptp_lock while doing RX timestamping
on high packet rates over several queues. Spinlock is not effecient
to protect timecounter for RX timestamps when reads are the most
usual operations and writes are only occasional. It's better to use
seqlock in such cases.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://patch.msgid.link/20241103215108.557531-2-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:33:26 -08:00
Vadim Fedorenko
bb2ef9b92b bnxt_en: cache only 24 bits of hw counter
This hardware can provide only 48 bits of cycle counter. We can leave
only 24 bits in the cache to extend RX timestamps from 32 bits to 48
bits. Lower 8 bits of the cached value will be used to check for
roll-over while extending to full 48 bits.
This change makes cache writes atomic even on 32 bit platforms and we
can simply use READ_ONCE()/WRITE_ONCE() pair and remove spinlock. The
configuration structure will be also reduced by 4 bytes.

Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Link: https://patch.msgid.link/20241103215108.557531-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 17:33:26 -08:00
Matthieu Baerts (NGI0)
f72aa1b276 selftests: net: include lib/sh/*.sh with lib.sh
Recently, the net/lib.sh file has been modified to include defer.sh from
net/lib/sh/ directory. The Makefile from net/lib has been modified
accordingly, but not the ones from the sub-targets using net/lib.sh.

Because of that, the new file is not installed as expected when
installing the Forwarding, MPTCP, and Netfilter targets, e.g.

  # make -C tools/testing/selftests TARGETS=net/mptcp install \
        INSTALL_PATH=/tmp/kself
  # cd /tmp/kself/
  # ./run_kselftest.sh -c net/mptcp
    TAP version 13
    1..7
    # timeout set to 1800
    # selftests: net/mptcp: mptcp_connect.sh
    # ./../lib.sh: line 5: /tmp/kself/net/lib/sh/defer.sh: No such file
      or directory
    # (...)

This can be fixed simply by adding all the .sh files from net/lib/sh
directory to the TEST_INCLUDES variable in the different Makefile's.

Fixes: a6e263f125 ("selftests: net: lib: Introduce deferred commands")
Signed-off-by: Matthieu Baerts (NGI0) <matttbe@kernel.org>
Reviewed-by: Petr Machata <petrm@nvidia.com>
Link: https://patch.msgid.link/20241104-net-next-selftests-lib-sh-deps-v1-1-7c9f7d939fc2@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 16:46:39 -08:00
Vadim Fedorenko
0452a2d8b8 mlx5_en: use read sequence for gettimex64
The gettimex64() doesn't modify values in timecounter, that's why there
is no need to update sequence counter. Reduce the contention on sequence
lock for multi-thread PHC reading use-case.

Signed-off-by: Vadim Fedorenko <vadfed@meta.com>
Reviewed-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Acked-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241014170103.2473580-1-vadfed@meta.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-05 15:47:14 -08:00
Paolo Abeni
ccb35037c4 Merge branch 'net-lan969x-add-vcap-functionality'
Daniel Machon says:

====================
net: lan969x: add VCAP functionality

== Description:

This series is the third of a multi-part series, that prepares and adds
support for the new lan969x switch driver.

The upstreaming efforts is split into multiple series (might change a
bit as we go along):

        1) Prepare the Sparx5 driver for lan969x (merged)

        2) Add support for lan969x (same basic features as Sparx5
           provides excl. FDMA and VCAP, merged).

    --> 3) Add lan969x VCAP functionality.

        4) Add RGMII and FDMA functionality.

== VCAP support:

The Versatile Content-Aware Processor (VCAP) is a content-aware packet
processor that allows wirespeed packet inspection for rich
implementation of, for example, advanced VLAN and QoS classification and
manipulations, IP source guarding, longest prefix matching for Layer-3
routing, and security features for wireline and wireless applications.
This is all achieved by programming rules into the VCAP.

When a VCAP is enabled, every frame passing through the switch is
analyzed and multiple keys are created based on the contents of the
frame. The frame is examined to determine the frame type (for example,
IPv4 TCP frame), so that the frame information is extracted according to
the frame type, port-specific configuration, and classification results
from the basic classification. Keys are applied to the VCAP and when
there is a match between a key and a rule in the VCAP, the rule is then
applied to the frame from which the key was extracted.

After this series is applied, the lan969x driver will support the same
VCAP functionality as Sparx5.

== Patch breakdown:

Patch #1 exposes some VCAP symbols for lan969x.

Patch #2 replaces VCAP uses of SPX5_PORTS with n_ports from the match
data.

Patch #3 adds new VCAP constants to match data

Patch #4 removes the is_sparx5() check to now initialize the VCAP API on
lan969x.

Patch #5 adds the auto-generated VCAP data for lan969x.

Patch #6 adds the VCAP configuration data for lan969x.

Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
====================

Link: https://patch.msgid.link/20241101-sparx5-lan969x-switch-driver-3-v1-0-3c76f22f4bfa@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 13:31:10 +01:00
Daniel Machon
1091487dc7 net: lan969x: add VCAP configuration data
Add configuration data (for consumption by the VCAP API) for the four
VCAP's that we are going to support. The following VCAP's will be
supported:

 - VCAP CLM: (also known as IS0) is part of the analyzer and enables
   frame classification using VCAP functionality.

 - VCAP IS2: is part of ANA_ACL and enables access control lists, using
   VCAP functionality.

 - VCAP ES0: is part of the rewriter and enables rewriting of frames
   using VCAP functionality.

 - VCAP ES2: is part of EACL and enables egress access control lists
   using VCAP functionality

The two VCAP's: CLM and IS2 use shared resources from the SUPER VCAP.
The SUPER VCAP is a shared pool of 6 blocks that can be distributed
freely among CLM and IS2.  Each block in the pool has 3,072 addresses
with entries, actions, and counters. ES0 and ES2 does not use shared
resources.

In the configuration data for lan969x CLM uses blocks 2-4 with a total
of 6 lookups. IS2 uses blocks 0-1 with a total of 4 lookups.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 13:31:08 +01:00
Daniel Machon
7ef750e490 net: lan969x: add autogenerated VCAP information
Platform VCAP data for each VCAP instance is auto-generated using an
internal Microchip tool. The generated VCAP data contains information
about keyfields, keyfield sets, actionfields, actionfield sets and
typegroups, which in combination are used to encode and decode rules in
the VCAP.

Add the auto-generated VCAP file lan969x_vcap_ag_api.c and assign the
two structs: lan969x_vcaps and lan969x_vcap_stats to the match data.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 13:31:08 +01:00
Daniel Machon
d4c97e39bf net: sparx5: execute sparx5_vcap_init() on lan969x
The is_sparx5() check was introduced in an earlier series, to make sure
the sparx5_vcap_init() was not executed on lan969x, as it was not
implemented there yet. Now that it is, remove that check.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 13:31:08 +01:00
Daniel Machon
8caa21e4e4 net: sparx5: add new VCAP constants to match data
In preparation for lan969x VCAP support, add the following three new
VCAP constants to match data:

    - vcaps_cfg (contains configuration data for each VCAP).

    - vcaps (contains auto-generated information about VCAP keys and
      actions).

    - vcap_stats: (contains auto-generated string names of all the keys
      and actions)

Add these constants to the Sparx5 match data constants and use them to
initialize the VCAP's in sparx5_vcap_init().

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 13:31:08 +01:00
Daniel Machon
8f5a812eff net: sparx5: replace SPX5_PORTS with n_ports
The Sparx5 VCAP implementation uses the SPX5_PORTS symbol to iterate over
the 65 front ports of Sparx5. Replace the use with the n_ports constant
from the match data, which translates to 65 of Sparx5 and 30 on lan969x.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 13:31:08 +01:00
Daniel Machon
9bdb67b53f net: sparx5: expose some sparx5 VCAP symbols
In preparation for lan969x VCAP support, expose the following symbols for
use by the lan969x VCAP implementation:

- The symbols SPARX5_*_LOOKUPS defines the number of lookups in each
  VCAP instance.  These are the same for lan969x. Move them to the
  header file.

- The struct sparx5_vcap_inst encapsulates information about a single
  VCAP instance. Move this struct to the header file and declare the
  sparx5_vcap_inst_cfg as extern.

Reviewed-by: Steen Hegelund <Steen.Hegelund@microchip.com>
Reviewed-by: Jens Emil Schulz Østergaard <jensemil.schulzostergaard@microchip.com>
Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 13:31:08 +01:00
Paolo Abeni
7af3a6558c Merge branch 'virtio_net-enable-premapped-mode-by-default'
Xuan Zhuo says:

====================
virtio_net: enable premapped mode by default

v1:
    1. fix some small problems
    2. remove commit "virtio_net: introduce vi->mode"

In the last linux version, we disabled this feature to fix the
regress[1].

The patch set is try to fix the problem and re-enable it.

More info: http://lore.kernel.org/all/20240820071913.68004-1-xuanzhuo@linux.alibaba.com

[1]: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com
====================

Link: https://patch.msgid.link/20241029084615.91049-1-xuanzhuo@linux.alibaba.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 11:39:26 +01:00
Xuan Zhuo
fb22437c1b virtio_net: rx remove premapped failover code
Now, the premapped mode can be enabled unconditionally.

So we can remove the failover code for merge and small mode.

The virtnet_rq_xxx() helper would be only used if the mode is using pre
mapping. A check is added to prevent misusing of these API.

Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 11:37:41 +01:00
Xuan Zhuo
47008bb51c virtio_net: enable premapped mode for merge and small by default
Currently, the virtio core will perform a dma operation for each
buffer. Although, the same page may be operated multiple times.

In premapped mod, we can perform only one dma operation for the pages of
the alloc frag. This is beneficial for the iommu device.

kernel command line: intel_iommu=on iommu.passthrough=0

       |  strict=0  | strict=1
Before |  775496pps | 428614pps
After  | 1109316pps | 742853pps

In the 6.11, we disabled this feature because a regress [1].

Now, we fix the problem and re-enable it.

[1]: http://lore.kernel.org/all/8b20cc28-45a9-4643-8e87-ba164a540c0a@oracle.com

Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 11:37:40 +01:00
Xuan Zhuo
a33f3df850 virtio_net: big mode skip the unmap check
The virtio-net big mode did not enable premapped mode,
so we did not need to check the unmap. And the subsequent
commit will remove the failover code for failing enable
premapped for merge and small mode. So we need to remove
the checking do_dma code in the big mode path.

Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 11:37:40 +01:00
Xuan Zhuo
6aacd14844 virtio-net: fix overflow inside virtnet_rq_alloc
When the frag just got a page, then may lead to regression on VM.
Specially if the sysctl net.core.high_order_alloc_disable value is 1,
then the frag always get a page when do refill.

Which could see reliable crashes or scp failure (scp a file 100M in size
to VM).

The issue is that the virtnet_rq_dma takes up 16 bytes at the beginning
of a new frag. When the frag size is larger than PAGE_SIZE,
everything is fine. However, if the frag is only one page and the
total size of the buffer and virtnet_rq_dma is larger than one page, an
overflow may occur.

The commit f9dac92ba9 ("virtio_ring: enable premapped mode whatever
use_dma_api") introduced this problem. And we reverted some commits to
fix this in last linux version. Now we try to enable it and fix this
bug directly.

Here, when the frag size is not enough, we reduce the buffer len to fix
this problem.

Reported-by: "Si-Wei Liu" <si-wei.liu@oracle.com>
Tested-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Xuan Zhuo <xuanzhuo@linux.alibaba.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-05 11:37:40 +01:00
Jakub Kicinski
c688a96c43 Merge branch 'fix-sparse-warnings-in-dpaa_eth-driver'
Vladimir Oltean says:

====================
Fix sparse warnings in dpaa_eth driver

This is a follow-up of the discussion at:
https://lore.kernel.org/oe-kbuild-all/20241028-sticky-refined-lionfish-b06c0c@leitao/
where I said I would take care of the sparse warnings uncovered by
Breno's COMPILE_TEST change for the dpaa_eth driver.

There was one warning that I decided to treat as an actual bug:
https://lore.kernel.org/netdev/20241029163105.44135-1-vladimir.oltean@nxp.com/
and what remains here are those warnings which I consider harmless.

I would like Christophe to ack the entire series to be taken through
netdev. I find it weird that the qbman driver, whose major API consumer
is netdev, is maintained by a different group. In this case, the buggy
qm_sg_entry_get_off() function is defined in qbman but exclusively
called in netdev.
====================

Link: https://patch.msgid.link/20241029164317.50182-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:44:45 -08:00
Vladimir Oltean
0a746cf8bb net: dpaa_eth: extract hash using __be32 pointer in rx_default_dqrr()
Sparse provides the following output:

warning: cast to restricted __be32

This is a harmless warning due to the fact that we dereference the hash
stored in the FD using an incorrect type annotation. Suppress the
warning by using the correct __be32 type instead of u32. No functional
change.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20241029164317.50182-4-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:44:43 -08:00
Vladimir Oltean
81f8ee2823 net: dpaa_eth: add assertions about SGT entry offsets in sg_fd_to_skb()
Multi-buffer frame descriptors (FDs) point to a buffer holding a
scatter/gather table (SGT), which is a finite array of fixed-size
entries, the last of which has qm_sg_entry_is_final(&sgt[i]) == true.

Each SGT entry points to a buffer holding pieces of the frame.
DPAARM.pdf explains in the figure called "Internal and External Margins,
Scatter/Gather Frame Format" that the SGT table is located within its
buffer at the same offset as the frame data start is located within the
first packet buffer.

                                 +------------------------+
    Scatter/Gather Buffer        |        First Buffer    |   Last Buffer
      ^ +------------+ ^       +-|---->^ +------------+   +->+------------+
      | |            | | ICEOF | |     | |            |      |////////////|
      | +------------+ v       | |     | |            |      |////////////|
 BSM  | |/ part of //|         | |BSM  | |            |      |////////////|
      | |/ Internal /|         | |     | |            |      |////////////|
      | |/ Context //|         | |     | |            |      |// Frame ///|
      | +------------+         | |     | |            | ...  |/ content //|
      | |            |         | |     | |            |      |////////////|
      | |            |         | |     | |            |      |////////////|
      v +------------+         | |     v +------------+      |////////////|
        | Scatter/ //| sgt[0]--+ |       |// Frame ///|      |////////////|
        | Gather List| ...       |       |/ content //|      +------------+ ^
        |////////////| sgt[N]----+       |////////////|      |            | | BEM
        |////////////|                   |////////////|      |            | |
        +------------+                   +------------+      +------------+ v

BSM = Buffer Start Margin, BEM = Buffer End Margin, both are configured
by dpaa_eth_init_rx_port() for the RX FMan port relevant here.

sg_fd_to_skb() runs in the calling context of rx_default_dqrr() -
the NAPI receive callback - which only expects to receive contiguous
(qm_fd_contig) or scatter/gather (qm_fd_sg) frame descriptors.
Everything else is irrelevant codewise.

The processing done by sg_fd_to_skb() is weird because it does not
conform to the expectations laid out by the aforementioned figure.
Namely, it parses the OFFSET field only for SGT entries with i != 0
(codewise, skb != NULL). In those cases, OFFSET should always be 0.
Also, it does not parse the OFFSET field for the sgt[0] case, the only
case where the buffer offset is meaningful in this context. There, it
uses the fd_off, aka the offset to the Scatter/Gather List in the
Scatter/Gather Buffer from the figure. By equivalence, they should both
be equal to the BSM (in turn, equal to priv->rx_headroom).

This can actually be explained due to the bug which we had in
qm_sg_entry_get_off() until the previous change:

- qm_sg_entry_get_off() did not actually _work_ for sgt[0]. It returned
  zero even with a non-zero offset, so fd_off had to be used as a fill-in.

- qm_sg_entry_get_off() always returned zero for sgt[i>0], and that
  resulted in no user-visible bug, because the buffer offset _was
  supposed_ to be zero for those buffers. So remove it from calculations.

Add assertions about the OFFSET field in both cases (first or subsequent
SGT entries) to make it absolutely obvious when something is not well
handled.

Similar logic can be seen in the driver for the architecturally similar
DPAA2, where dpaa2_eth_build_frag_skb() calls dpaa2_sg_get_offset() only
for i == 0. For the rest, there is even a comment stating the same thing:

	 * Data in subsequent SG entries is stored from the
	 * beginning of the buffer, so we don't need to add the
	 * sg_offset.

Tested on LS1046A.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20241029164317.50182-3-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:44:43 -08:00
Vladimir Oltean
a12fcef429 soc: fsl_qbman: use be16_to_cpu() in qm_sg_entry_get_off()
struct qm_sg_entry :: offset is a 13-bit field, declared as __be16.

When using be32_to_cpu(), a wrong value will be calculated on little
endian systems (Arm), because type promotion from 16-bit to 32-bit,
which is done before the byte swap and always in the CPU native
endianness, changes the value of the scatter/gather list entry offset in
big-endian interpretation (adds two zero bytes in the LSB interpretation).
The result of the byte swap is ANDed with GENMASK(12, 0), so the result
is always zero, because only those bytes added by type promotion remain
after the application of the bit mask.

The impact of the bug is that scatter/gather frames with a non-zero
offset into the buffer are treated by the driver as if they had a zero
offset. This is all in theory, because in practice, qm_sg_entry_get_off()
has a single caller, where the bug is inconsequential, because at that
call site the buffer offset will always be zero, as will be explained in
the subsequent change.

Flagged by sparse:

warning: cast to restricted __be32
warning: cast from restricted __be16

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Breno Leitao <leitao@debian.org>
Acked-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Acked-by: Madalin Bucur <madalin.bucur@oss.nxp.com>
Link: https://patch.msgid.link/20241029164317.50182-2-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:44:43 -08:00
Rosen Penev
d2068805f6 net: ena: remove devm from ethtool
There's no need for devm bloat here. In addition, these are freed right
before the function exits.

Also swapped kcalloc order for consistency.

Signed-off-by: Rosen Penev <rosenp@gmail.com>
Reviewed-by: Shay Agroskin <shayagr@amazon.com>
Link: https://patch.msgid.link/20241101214828.289752-2-rosenp@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:21:52 -08:00
David Woodhouse
18ec5491a4 ptp: Remove 'default y' for VMCLOCK PTP device
The VMCLOCK device gives support for accurate timekeeping even across
live migration, unlike the KVM PTP clock. To help ensure that users can
always use ptp_vmclock where it's available in preference to ptp_kvm,
set it to 'default PTP_1588_CLOCK_VMCLOCK' instead of 'default y'.

Signed-off-by: David Woodhouse <dwmw@amazon.co.uk>
Link: https://patch.msgid.link/89955b74d225129d6e3d79b53aa8d81d1b50560f.camel@infradead.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:18:10 -08:00
Dr. David Alan Gilbert
6a7d68f727 net: ena: Remove deadcode
ena_com_get_dev_basic_stats() has been unused since 2017's
commit d81db24056 ("net/ena: refactor ena_get_stats64 to be atomic
context safe")

ena_com_get_offload_settings() has been unused since the original
commit of ENA back in 2016 in
commit 1738cd3ed3 ("net: ena: Add a driver for Amazon Elastic
Network Adapters (ENA)")

Remove them.

Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Reviewed-by: David Arinzon <darinzon@amazon.com>
Link: https://patch.msgid.link/20241102220142.80285-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:17:37 -08:00
Dr. David Alan Gilbert
b356b91708 net: ena: Remove autopolling mode
This manually reverts
commit a4e262cde3 ("net: ena: allow automatic fallback to polling mode")

which is unused.

(I did it manually because there are other minor comment
and function changes surrounding it).
Build tested only.

Suggested-by: David Arinzon <darinzon@amazon.com>
Signed-off-by: Dr. David Alan Gilbert <linux@treblig.org>
Link: https://patch.msgid.link/20241103194149.293456-1-linux@treblig.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:12:56 -08:00
Jakub Kicinski
690e50dd69 tools: ynl-gen: de-kdocify enums with no doc for entries
Sometimes the names of the enum entries are self-explanatory
or come from standards. Forcing authors to write trivial kdoc
for each of such entries seems unreasonable, but kdoc would
complain about undocumented entries.

Detect enums which only have documentation for the entire
type and no documentation for entries. Render their doc
as a plain comment.

Link: https://patch.msgid.link/20241103165314.1631237-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:11:47 -08:00
Menglong Dong
0a2cdeeae9 net: tcp: replace the document for "lsndtime" in tcp_sock
Commit d5fed5addb ("tcp: reorganize tcp_sock fast path variables")
moved the fields around and misplaced the documentation for "lsndtime".
So, let's replace it in the proper place.

Signed-off-by: Menglong Dong <dongml2@chinatelecom.cn>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241104070041.64302-1-dongml2@chinatelecom.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-04 18:10:56 -08:00
David S. Miller
ecf99864ea Merge branch 'mx95-netc-support'
Wei Fang says:

====================
net: add basic support for i.MX95 NETC

This is first time that the NETC IP is applied on i.MX MPU platform.
Its revision has been upgraded to 4.1, which is very different from
the NETC of LS1028A (its revision is 1.0). Therefore, some existing
drivers of NETC devices in the Linux kernel are not compatible with
the current hardware. For example, the fsl-enetc driver is used to
drive the ENETC PF of LS1028A, but for i.MX95 ENETC PF, its registers
and tables configuration are very different from those of LS1028A,
and only the station interface (SI) part remains basically the same.
For the SI part, Vladimir has separated the fsl-enetc-core driver, so
we can reuse this driver on i.MX95. However, for other parts of PF,
the fsl-enetc driver cannot be reused, so the nxp-enetc4 driver is
added to support revision 4.1 and later.

During the development process, we found that the two PF drivers have
some interfaces with basically the same logic, and the only difference
is the hardware configuration. So in order to reuse these interfaces
and reduce code redundancy, we extracted these interfaces and compiled
them into a separate nxp-enetc-pf-common driver for use by the two PF
drivers.

In addition, we have developed the nxp-netc-blk-ctrl driver, which
is used to control three blocks, namely Integrated Endpoint Register
Block (IERB), Privileged Register Block (PRB) and NETCMIX block. The
IERB contains registers that are used for pre-boot initialization,
debug, and non-customer configuration. The PRB controls global reset
and global error handling for NETC. The NETCMIX block is mainly used
to set MII protocol and PCS protocol of the links, it also contains
settings for some other functions.

---
v1 Link: https://lore.kernel.org/imx/20241009095116.147412-1-wei.fang@nxp.com/
v2 Link: https://lore.kernel.org/imx/20241015125841.1075560-1-wei.fang@nxp.com/
v3 Link: https://lore.kernel.org/imx/20241017074637.1265584-1-wei.fang@nxp.com/
v4 Link: https://lore.kernel.org/imx/20241022055223.382277-1-wei.fang@nxp.com/
v5 Link: https://lore.kernel.org/imx/20241024065328.521518-1-wei.fang@nxp.com/
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:52 +00:00
Wei Fang
f488649e40 MAINTAINERS: update ENETC driver files and maintainers
Add related YAML documentation and header files. Also, add maintainers
from the i.MX side as ENETC starts to be used on i.MX platforms.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:51 +00:00
Wei Fang
99100d0d99 net: enetc: add preliminary support for i.MX95 ENETC PF
The i.MX95 ENETC has been upgraded to revision 4.1, which is different
from the LS1028A ENETC (revision 1.0) except for the SI part. Therefore,
the fsl-enetc driver is incompatible with i.MX95 ENETC PF. So add new
nxp-enetc4 driver to support i.MX95 ENETC PF, and this driver will be
used to support the ENETC PF with major revision 4 for other SoCs in the
future.

Currently, the nxp-enetc4 driver only supports basic transmission feature
for i.MX95 ENETC PF, the more basic and advanced features will be added
in the subsequent patches. In addition, PCS support has not been added
yet, so 10G ENETC (ENETC instance 2) is not supported now.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:51 +00:00
Clark Wang
9e7f211619 net: enetc: optimize the allocation of tx_bdr
There is a situation where num_tx_rings cannot be divided by bdr_int_num.
For example, num_tx_rings is 8 and bdr_int_num is 3. According to the
previous logic, this results in two tx_bdr corresponding memories not
being allocated, so when sending packets to tx ring 6 or 7, wild pointers
will be accessed. Of course, this issue doesn't exist on LS1028A, because
its num_tx_rings is 8, and bdr_int_num is either 1 or 2. However, there
is a risk for the upcoming i.MX95. Therefore, it is necessary to ensure
that each tx_bdr can be allocated to the corresponding memory.

Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Claudiu Manoil <claudiu.manoil@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:51 +00:00
Clark Wang
b4bfd0a904 net: enetc: extract enetc_int_vector_init/destroy() from enetc_alloc_msix()
Extract enetc_int_vector_init() and enetc_int_vector_destroy() from
enetc_alloc_msix() so that the code is more concise and readable.

Signed-off-by: Clark Wang <xiaoning.wang@nxp.com>
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:51 +00:00
Wei Fang
a52201fb9c net: enetc: add i.MX95 EMDIO support
The verdor ID and device ID of i.MX95 EMDIO are different from LS1028A
EMDIO, so add new vendor ID and device ID to pci_device_id table to
support i.MX95 EMDIO.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:51 +00:00
Vladimir Oltean
86831a3f4c net: enetc: remove ERR050089 workaround for i.MX95
The ERR050089 workaround causes performance degradation and potential
functional issues (e.g., RCU stalls) under certain workloads. Since
new SoCs like i.MX95 do not require this workaround, use a static key
to compile out enetc_lock_mdio() and enetc_unlock_mdio() at runtime,
improving performance and avoiding unnecessary logic.

Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:51 +00:00
Wei Fang
3774409fd4 net: enetc: build enetc_pf_common.c as a separate module
Compile enetc_pf_common.c as a standalone module to allow shared usage
between ENETC v1 and v4 PF drivers. Add struct enetc_pf_ops to register
different hardware operation interfaces for both ENETC v1 and v4 PF
drivers.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:51 +00:00
Wei Fang
80c8c85261 net: enetc: extract common ENETC PF parts for LS1028A and i.MX95 platforms
The ENETC PF driver of LS1028A (rev 1.0) is incompatible with the version
used on the i.MX95 platform (rev 4.1), except for the station interface
(SI) part. To reduce code redundancy and prepare for a new driver for rev
4.1 and later, extract shared interfaces from enetc_pf.c and move them to
enetc_pf_common.c. This refactoring lays the groundwork for compiling
enetc_pf_common.c into a shared driver for both platforms' PF drivers.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:51 +00:00
Wei Fang
fe5ba6bf91 net: enetc: add initial netc-blk-ctrl driver support
The netc-blk-ctrl driver is used to configure Integrated Endpoint
Register Block (IERB) and Privileged Register Block (PRB) of NETC.
For i.MX platforms, it is also used to configure the NETCMIX block.

The IERB contains registers that are used for pre-boot initialization,
debug, and non-customer configuration. The PRB controls global reset
and global error handling for NETC. The NETCMIX block is mainly used
to set MII protocol and PCS protocol of the links, it also contains
settings for some other functions.

Note the IERB configuration registers can only be written after being
unlocked by PRB, otherwise, all write operations are inhibited. A warm
reset is performed when the IERB is unlocked, and it results in an FLR
to all NETC devices. Therefore, all NETC device drivers must be probed
or initialized after the warm reset is finished.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:50 +00:00
Wei Fang
f70384e53b dt-bindings: net: add bindings for NETC blocks control
Add bindings for NXP NETC blocks control. Usually, NETC has 2 blocks of
64KB registers, integrated endpoint register block (IERB) and privileged
register block (PRB). IERB is used for pre-boot initialization for all
NETC devices, such as ENETC, Timer, EMDIO and so on. And PRB controls
global reset and global error handling for NETC. Moreover, for the i.MX
platform, there is also a NETCMIX block for link configuration, such as
MII protocol, PCS protocol, etc.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:50 +00:00
Wei Fang
db2fb74c85 dt-bindings: net: add i.MX95 ENETC support
The ENETC of i.MX95 has been upgraded to revision 4.1, and the vendor
ID and device ID have also changed, so add the new compatible strings
for i.MX95 ENETC. In addition, i.MX95 supports configuration of RGMII
or RMII reference clock.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:50 +00:00
Wei Fang
da98dbbc2c dt-bindings: net: add compatible string for i.MX95 EMDIO
The EMDIO of i.MX95 has been upgraded to revision 4.1, and the vendor
ID and device ID have also changed, so add the new compatible strings
for i.MX95 EMDIO.

Signed-off-by: Wei Fang <wei.fang@nxp.com>
Reviewed-by: Frank Li <Frank.Li@nxp.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Rob Herring (Arm) <robh@kernel.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-04 10:03:50 +00:00
Jakub Kicinski
8d1807a95c Merge branch 'mlx5-misc-patches-2024-10-31'
Tariq Toukan says:

====================
mlx5 misc patches 2024-10-31

First patch by Cosmin fixes an issue in a recent commit.

Followed by 2 patches by Yevgeny that organize and rename the files
under the steering directory.

Finally, 2 patches by William that save the creation of the unused
egress-XDP_REDIRECT send queue on non-uplink representor.
====================

Link: https://patch.msgid.link/20241031125856.530927-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 15:37:17 -08:00
William Tu
355cf27497 net/mlx5e: do not create xdp_redirect for non-uplink rep
XDP and XDP socket require extra SQ/RQ/CQs. Most of these resources
are dynamically created: no XDP program loaded, no resources are
created. One exception is the SQ/CQ created for XDP_REDRIECT, used
for other netdev to forward packet to mlx5 for transmit. The patch
disables creation of SQ and CQ used for egress XDP_REDIRECT, by
checking whether ndo_xdp_xmit is set or not.

For netdev without XDP support such as non-uplink representor, this
saves around 0.35MB of memory, per representor netdevice per channel.

Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241031125856.530927-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 15:37:15 -08:00
William Tu
bb135e4012 net/mlx5e: move XDP_REDIRECT sq to dynamic allocation
Dynamically allocating xdpsq, used by egress side XDP_REDIRECT.
mlx5 has multiple XDP sqs. Under struct mlx5e_channel:
1. rx_xdpsq: used for XDP_TX, an XDP prog handles the rx packet and
transmits using the same queue as rx.
2. xdpsq: used by egress side XDP_REDIRECT. This is for another interface
to redirect packet to the mlx5 interface, using ndo_xdp_xmit .
3. xsksq: used by XSK. XSK has its own dedicated channel, and it also
has resources of 1 and 2.

The patch changes only the 2. xdpsq.

Signed-off-by: William Tu <witu@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241031125856.530927-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 15:37:15 -08:00
Yevgeny Kliteynik
a2740138ec net/mlx5: HWS, renamed the files in accordance with naming convention
Removed the 'mlx5hws_' file name prefix from the internal HWS files.

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241031125856.530927-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 15:37:15 -08:00
Yevgeny Kliteynik
e03cf32188 net/mlx5: DR, moved all the SWS code into a separate directory
After adding HWS support in a separate folder, moving all the SWS
code into its own folder as well.
Now SWS and HWS implementation are located in their appropriate
folders:
 - steering/sws/
 - steering/hws/

Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241031125856.530927-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 15:37:15 -08:00
Cosmin Ratiu
cac7356c65 net/mlx5: Rework esw qos domain init and cleanup
The first approach was flawed, because there are situations where the
esw mode change fails, leaving the qos domain as NULL. Various calls
into the QoS infra then trigger a NULL pointer access and unhappiness.

Improve that by a combination of:
- Allocating the QoS domain on esw init and cleaning it up on teardown.
- Refactoring mode change to only call qos domain init but not cleanup.
- Making qos domain init idempotent - not change anything if nothing
  needs changing.

Together, these should guarantee that, as long as the memory allocations
succeed, there should always be a valid qos domain until the esw
cleanup, no matter what mode changes happen (or failures thereof).

Fixes: 107a034d5c ("net/mlx5: qos: Store rate groups in a qos domain")
Signed-off-by: Cosmin Ratiu <cratiu@nvidia.com>
Reviewed-by: Carolina Jubran <cjubran@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://patch.msgid.link/20241031125856.530927-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-11-03 15:37:14 -08:00