In mtk_flow_entry_update_l2(), hwe->ib1 is read using READ_ONCE() at the
beginning of the function and checked, and then re-read from hwe->ib1,
which may void all guarantees of the checks. Reuse the value that was read
by READ_ONCE() to ensure the consistency of ib1 throughout the function.
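A minimal sketch of the pattern (with a hypothetical consumer, not the
exact mtk_ppe code):

  u32 ib1 = READ_ONCE(hwe->ib1);

  if (FIELD_GET(MTK_FOE_IB1_STATE, ib1) != MTK_FOE_STATE_BIND)
          continue;

  /* Keep using the 'ib1' snapshot from here on; a second read of
   * hwe->ib1 could observe a concurrent update that no longer
   * satisfies the check above.
   */
  update_entry(entry, ib1);	/* hypothetical consumer */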
Signed-off-by: linke li <lilinke99@qq.com>
Link: https://lore.kernel.org/r/tencent_C699E9540505523424F11A9BD3D21B86840A@qq.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The definition and declaration of sja1110_pcs_mdio_write_c45() don't have
parameters in the same order.
Knowing that sja1110_pcs_mdio_write_c45() is used as a function pointer
in the 'sja1105_info' structure as .pcs_mdio_write_c45, and that we have:
int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd,
                          int reg, u16 val);
it is likely that the definition is the one to change.
Found with cppcheck, funcArgOrderDifferent.
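For illustration, a hypothetical declaration/definition pair (not the
actual sja1105 code) shows why this breaks silently when the swapped
parameters share a type:

  /* declaration used by callers, e.g. through .pcs_mdio_write_c45 */
  int pcs_write(struct mii_bus *bus, int phy, int mmd, int reg, u16 val);

  /* definition with 'mmd' and 'reg' swapped -- same types, so it still
   * compiles, but callers following the declaration now pass the register
   * number where the body expects the MMD, and vice versa.
   */
  int pcs_write(struct mii_bus *bus, int phy, int reg, int mmd, u16 val)
  {
          return mdiobus_c45_write(bus, phy, mmd, reg, val);
  }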
Fixes: ae271547bb ("net: dsa: sja1105: C45 only transactions for PCS")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Michael Walle <mwalle@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/ff2a5af67361988b3581831f7bd1eddebfb4c48f.1712082763.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The error statistics should be updated each time the poll function is
called, even if the full RX work budget has been consumed. This prevents
the counts from becoming stuck when RX bandwidth usage is high.
This also ensures that error counters are not updated after we've
re-enabled interrupts as that could result in a race condition.
Also drop an unnecessary space.
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240402145305.82148-2-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
The TX queue should be serviced each time the poll function is called,
even if the full RX work budget has been consumed. This prevents
starvation of the TX queue when RX bandwidth usage is high.
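A rough sketch of the intended poll ordering (hypothetical helper names,
not the actual ravb code):

  static int my_poll(struct napi_struct *napi, int budget)
  {
          struct my_priv *priv = container_of(napi, struct my_priv, napi);
          int work_done;

          /* Service TX completions and error counters unconditionally,
           * even if the RX work below consumes the whole budget.
           */
          my_tx_free(priv);			/* hypothetical */
          my_update_error_stats(priv);		/* hypothetical */

          work_done = my_rx(priv, budget);	/* hypothetical */

          if (work_done < budget && napi_complete_done(napi, work_done))
                  my_enable_interrupts(priv);	/* hypothetical */

          return work_done;
  }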
Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Hook unregistration is deferred to the commit phase; the same occurs with
hook updates triggered by the table dormant flag. When both commands are
combined, this results in deleting a basechain while leaving its hook
still registered in the core.
Fixes: 179d9ba555 ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can
run concurrently with __nft_flowtable_type_get() within
nf_tables_newflowtable(), and there is no protection when iterating over
the nf_tables_flowtables list in __nft_flowtable_type_get(). Therefore,
there is a potential data race on the nf_tables_flowtables list entries.
Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list
in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller
nft_flowtable_type_get() to protect the entire type query process.
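Condensed, the resulting pattern looks roughly like this (simplified, not
a verbatim copy of the nf_tables code):

  const struct nf_flowtable_type *type;

  rcu_read_lock();
  list_for_each_entry_rcu(type, &nf_tables_flowtables, list) {
          if (family == type->family && try_module_get(type->owner)) {
                  rcu_read_unlock();
                  return type;
          }
  }
  rcu_read_unlock();

  return NULL;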
Fixes: 3b49e2e94e ("netfilter: nf_tables: add flow table netlink frontend")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
When the dormant flag is toggled, hooks are disabled in the commit phase by
iterating over the current chains in the table (existing and new).
The following configuration allows for an inconsistent state:
add table x
add chain x y { type filter hook input priority 0; }
add table x { flags dormant; }
add chain x w { type filter hook input priority 1; }
which triggers the following warning when trying to unregister chain w
which is already unregistered.
[ 127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:501 __nf_unregister_net_hook+0x21a/0x260
[...]
[ 127.322519] Call Trace:
[ 127.322521] <TASK>
[ 127.322524] ? __warn+0x9f/0x1a0
[ 127.322531] ? __nf_unregister_net_hook+0x21a/0x260
[ 127.322537] ? report_bug+0x1b1/0x1e0
[ 127.322545] ? handle_bug+0x3c/0x70
[ 127.322552] ? exc_invalid_op+0x17/0x40
[ 127.322556] ? asm_exc_invalid_op+0x1a/0x20
[ 127.322563] ? kasan_save_free_info+0x3b/0x60
[ 127.322570] ? __nf_unregister_net_hook+0x6a/0x260
[ 127.322577] ? __nf_unregister_net_hook+0x21a/0x260
[ 127.322583] ? __nf_unregister_net_hook+0x6a/0x260
[ 127.322590] ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables]
[ 127.322655] nft_table_disable+0x75/0xf0 [nf_tables]
[ 127.322717] nf_tables_commit+0x2571/0x2620 [nf_tables]
Fixes: 179d9ba555 ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
The commit mutex should not be released during the critical section
between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC
worker could collect expired objects and get the released commit lock
within the same GC sequence.
nf_tables_module_autoload() temporarily releases the mutex to load
module dependencies, then it goes back to replay the transaction again.
Move it to the end of the abort phase, after nft_gc_seq_end() is called.
Cc: stable@vger.kernel.org
Fixes: 720344340f ("netfilter: nf_tables: GC transaction race with abort path")
Reported-by: Kuan-Ting Chen <hexrabbit@devco.re>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
Unlike the early commit path stage, which triggers a call to abort, an
explicit release of the batch is required on abort; otherwise the mutex is
released and commit_list remains in place.
Add WARN_ON_ONCE to ensure commit_list is empty from the abort path
before releasing the mutex.
After this patch, commit_list is always assumed to be empty before
grabbing the mutex, therefore
03c1f1ef15 ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()")
only needs to release the pending modules for registration.
Cc: stable@vger.kernel.org
Fixes: c0391b6ab8 ("netfilter: nf_tables: missing validation from the abort path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
As of now, tg3_power_down_prepare() always ends with success, but the
error handling code from the former tg3_set_power_state() call is still
there. This code became unreachable in commit c866b7eac0 ("tg3: Do not use
legacy PCI power management").
Remove the (now unreachable) error handling code for simplification and
change tg3_power_down_prepare() to a void function, as its result is no
longer checked.
Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240401191418.361747-1-kiryushin@ancud.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
Petr Machata says:
====================
mlxsw: Preparations for improving performance
Amit Cohen writes:
mlxsw driver will use NAPI for event processing in a next patch set.
Some additional improvements will be added later. This patch set
prepares the code for NAPI usage and refactor some relevant areas. See
more details in commit messages.
Patch Set overview:
Patches #1-#2 are preparations for patch #3
Patch #3 sets up tasklets as part of queue initialization
Patch #4 removes handling of an unlikely scenario
Patch #5 removes unused counters
Patch #6 makes a style change in mlxsw_pci_eq_tasklet()
Patches #7-#10 poll the command interface instead of using EQ0
Patches #11-#12 make a style change and break up the function
mlxsw_pci_cq_tasklet()
Patches #13-#14 remove functions which can be replaced by a stored value
Patch #15 improves access to the descriptor queue instance
====================
Link: https://lore.kernel.org/r/cover.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, for each completion, we check the descriptor queue number and
look the queue up via mlxsw_pci_{sdq,rdq}_get(). This is inefficient; the
DQ is the same for all the completions in a CQ, as each CQ handles only
one DQ - SDQ or RDQ. This mapping is set up as part of DQ initialization
via mlxsw_cmd_mbox_sw2hw_dq_cq_set().
Instead, as part of DQ initialization, set the DQ pointer in the
appropriate CQ structure. When we handle completions, warn in case the DQ
number that we expect is different from the number we get in the CQE. Call
WARN_ON_ONCE() only after checking the value, to avoid calling it for each
completion.
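Schematically (hypothetical field and accessor names, not the exact mlxsw
structures):

  /* DQ init: remember which DQ this CQ serves */
  cq->dq = dq;

  /* completion handling: compare first, warn only on mismatch */
  dqn = my_cqe_dqn_get(cqe);		/* hypothetical CQE accessor */
  if (unlikely(dqn != cq->dq->num))
          WARN_ON_ONCE(1);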
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/a5b2559cd6d532c120f3194f89a1e257110318f1.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, for each interrupt we call mlxsw_pci_cq_count() to determine the
number of CQs. This results in two additional function calls. It can be
avoided by storing this value as part of the 'mlxsw_pci' structure, as we
already do for the number of SDQs. Remove the function and
__mlxsw_pci_queue_count(), which is now unused, and store the value
instead.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f08ad113e8160678f3c8d401382a696c6c7f44c7.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The number of SDQs is stored as part of the 'mlxsw_pci' structure. In some
cases the driver uses this value, and in other cases it calls
mlxsw_pci_sdq_count() to get it. Align the code to use the stored value.
This simplifies the code and makes it clearer that the value is always the
same. Rename 'mlxsw_pci->num_sdq_cqs' to 'mlxsw_pci->num_sdqs' as it is now
used not only in the CQ context.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/0c8788506d9af35d589dbf64be35a508fd63d681.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Completion queues are used for completions of RDQ or SDQ. Each
completion queue is used for one DQ. The first CQs are used for SDQs and
the rest are used for RDQs.
Currently, for each CQE (completion queue element), we check the 'sr'
(send/receive) value to know if it is a completion of an RDQ or an SDQ. We
do not really have to check it, as the queue number already tells us
whether the queue handles Rx or Tx completions.
Break the tasklet into two - one for Rx (RDQ) and one for Tx (SDQ). Then,
set up the appropriate tasklet for each queue as part of queue
initialization. Use the 'sr' value only for the unlikely case that we get a
completion of a type we do not expect. Call WARN_ON_ONCE() only after
checking the value, to avoid calling it for each completion.
A next patch set will use NAPI to handle events, then we will have a
separate poll method for Rx and Tx. This change is a preparation for
NAPI usage.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/50fbc366f8de54cb5dc72a7c4f394333ef71f1d0.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This function will be broken into several functions later. As preparation,
reorder variables to reverse xmas tree.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/7170a8f4429ecb5a539b0374c621697778ff8363.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The previous patch changed the code so that the command interface is no
longer handled from the event queue. With this change the wait queue is not
used anymore. Remove it and the 'wait_done' variable.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f3af6a5a9dabd97d2920cefe475c6aa57767f504.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The device supports two event queues. EQ0 is used for command interface
completion events. EQ1 is used for completion events of RDQ or SDQ.
Currently, for each EQE (event queue element), we check the queue number
and handle accordingly. More than that, for each interrupt we schedule
tasklets for both EQs. This is really inefficient, especially given that
EQ0 is used only as part of driver init/fini, when EMADs are not
available. There is no point in scheduling the tasklet for it and checking
each EQE.
A previous patch changed the code to poll the command interface on each
use of it. It means that now there is no real reason to use EQ0, as we poll
the
command interface.
Initialize only one event queue and use it as EQ1 (this is determined by
queue number). Then, for each interrupt we can schedule the tasklet only
for one queue and we do not have to check the queue number. This
simplifies the code and should improve performance. Note that polling the
command interface is fine, as we use it only as part of driver init/fini.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/23d764f5c032e4c363b98590b746a4b32d2bf900.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently we use MLXSW_PCI_EQS_COUNT event queues. A next patch will
change the driver to initialize only EQ1, as EQ0 is not required anymore
when we poll the command interface.
Rename the macro to MLXSW_PCI_EQS_MAX; later we will not initialize the
maximum number of supported EQs, so this value will represent the maximum,
and a new macro will be added to represent the number of queues actually
used.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/b08df430b62f23ca1aa3aaa257896d2d95aa7691.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The command interface is used for configuring and querying FW when EMADs
are not available. While the driver sets up the asynchronous queues, it
polls the command interface for completions. Then,
there is a short period when asynchronous queues work, but EMADs are not
available (marked in the code as nopoll = true). During this time, we
send commands via command interface, but we do not poll it, as we can get
an interrupt for the completion. Completions of command interface are
received from HW in EQ0 (event queue 0).
EQ0 is used instead of polling only 4 times during initialization and once
during tear down, but it adds overhead for the whole lifetime of the
driver. For each interrupt, we have to check whether we got events in EQ0
or EQ1 and handle them. This is really inefficient, especially given that
EQ0 is used only as part of driver init/fini.
Instead, we can poll the command interface for each call of cmd_exec(). It
means that when we send a command via the command interface (as EMADs are
not available), we will poll it, regardless of the availability of the
asynchronous queues. This will allow us to later configure only EQ1 and
simplify the flow.
Remove the 'nopoll' indication and change mlxsw_pci_cmd_exec() to poll
until an answer or a timeout, regardless of the queues' state. For now,
completions are also handled by EQ0, but this will be removed in the next
patch. Additional cleanups will be added in subsequent patches.
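A simplified sketch of polling until answer or timeout (hypothetical
status helpers, not the actual mlxsw_pci code):

  unsigned long end = jiffies + msecs_to_jiffies(timeout_msecs);

  do {
          if (my_cmd_done(mlxsw_pci)) {		/* hypothetical: reads HW status */
                  *p_status = my_cmd_status(mlxsw_pci);
                  return 0;
          }
          cond_resched();
  } while (time_before(jiffies, end));

  return -ETIMEDOUT;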
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/e674c70380ceda953e0e45a77334c5d22e69938f.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This function will be used later only for EQ1. As preparation, reorder
variables to reverse xmas tree and return early when possible, to
simplify the code.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/2412d6c135b2a6aedb4484f5d8baab3aecd7b9ae.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The structure 'mlxsw_pci_queue' stores several counters which were consumed
via debugfs. Since commit 9a32562bec ("mlxsw: Remove debugfs interface"),
these counters are not used. Remove them. This makes the 'union u' and
'struct eq' redundant. Maintain 'struct cq' as it will be extended later.
Replace the incrementing of 'q->u.eq.ev_other_count' with WARN_ON_ONCE(),
as it is only reached in the unexpected case of receiving an event in an EQ
which is not EQ0 or EQ1. When the queues are initialized, we check the
number of event queues and fail with the message "Unsupported number of
queues" in case the driver tries to initialize more than two queues.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/ee9e658800aa0390e08342100bc27daff4c176c0.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, as part of mlxsw_pci_cq_tasklet(), we check whether any item
was handled, and only in that case do we arm the doorbell. This check is
unlikely to be false, as we schedule the tasklet only for CQs that we got
an event for, which means that they contain completions to handle. Remove
this check, which is expected to always be true; even if it were false, it
would not be a mistake to ring the doorbell. We could warn in such a case,
but it is not really worth adding a check that runs for each CQ handling
when we do not expect to reach it and it does not point to a logic error
that should be handled.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f8efa481bfe7bebb9f93bb803f44ab7da77f53e6.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Currently, the structure 'mlxsw_pci_queue_ops' holds a pointer to the
tasklet callback function. This is used only for EQs and CQs. The mlxsw
driver will use NAPI in a following patch set, so CQs will not use a
tasklet anymore. As preparation, remove this pointer from the shared
operations structure and set up the tasklet as part of queue
initialization.
For now, set up the tasklet for EQs and CQs. Later, the CQ code will be
changed.
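For example, something along these lines (hypothetical names;
tasklet_setup() and from_tasklet() are the standard kernel helpers):

  static void my_cq_tasklet(struct tasklet_struct *t)
  {
          struct my_queue *q = from_tasklet(q, t, tasklet);

          my_cq_handle(q);	/* hypothetical completion handling */
  }

  static int my_cq_init(struct my_queue *q)
  {
          /* callback is chosen per queue type here, not in the shared ops */
          tasklet_setup(&q->tasklet, my_cq_tasklet);
          return 0;
  }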
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/a326cae5fc1ad085a1a063c004983de6fe389414.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move mlxsw_pci_cq_{init, fini}() after mlxsw_pci_cq_tasklet() as a next
patch will set up the tasklet as part of initialization.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/25196cb5baf5acf6ec1e956203790e018ba8e306.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Move mlxsw_pci_eq_{init, fini}() after mlxsw_pci_eq_tasklet() as a next
patch will set up the tasklet as part of initialization.
Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/7ae120a02e1c490084daae7e684a0d40b7cce4e7.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Tariq Toukan says:
====================
mlx5 misc patches
This patchset includes small features and misc code enhancements for the
mlx5 core and EN drivers.
Patches 1-4 by Gal improve the mlx5e ethtool stats implementation, for
example by using the standard helpers ethtool_sprintf/puts.
Patch 5 by me adds a reset option for the FW command interface debugfs
stats entries. This allows explicit FW command interface stats reset
between different runs of a test case.
Patches 6 and 7 are simple cleanups.
Patch 8 by Gal adds driver support for 800Gbps link modes.
Patch 9 by Jianbo enhances the L4 steering abilities.
Patches 10-11 by Jianbo avoid redundant operations.
====================
Link: https://lore.kernel.org/r/20240402133043.56322-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Firmware will return 0 on query of BOOT/INIT PAGES for non-page-supplier
functions (external host PF/VF/SF), so no pages need to be allocated for
them.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Page events are not issued by the device for a function if
page_request_disable is set, so there is no need to create the pages EQ.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Replace matching on the TCP and UDP protocols with the new l4_type field,
which is parsed by steering for the ttc_table. It is enabled by the
outer_l4_type or inner_l4_type bits in the nic_rx or port_sel flow table
capabilities and is used only if the pcc_ifa2 bit in the HCA capabilities
is set.
Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Add support for 800Gbps speed, link modes of 100Gbps per lane.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In the kernel, the preferred types are uX.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Starting from commit
eb9b9fdcaf ("net/mlx5e: Introduce extended version for mlx5e_xmit_data")
sinfo is no longer passed as an argument to
mlx5e_xmit_xdp_frame(), so the comment is inconsistent.
check_result must be zero when the packet is fragmented.
Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Resetting stats just before some test/debug case allows us to eliminate
the impact of previous commands. This is useful in particular for the
average latency calculation.
The average_write() callback was unreachable, as "average" is a
read-only file. Extend, rename, and use it for a newly exposed
write-only "reset" file.
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The fill_strings() callbacks were changed to accept a **data pointer,
and not rely on propagating the index value.
Make a similar change to fill_stats() callbacks to keep the API
consistent.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.
Change the fill_strings callback to accept a **data pointer, and remove
the index and return value.
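For reference, a minimal sketch of the resulting callback shape
(hypothetical stats table, not the mlx5e code):

  static const char * const my_stats[] = { "rx_packets", "rx_csum_err" };

  static void my_fill_strings(struct net_device *dev, u8 **data)
  {
          int i;

          for (i = 0; i < ARRAY_SIZE(my_stats); i++)
                  ethtool_puts(data, my_stats[i]);   /* advances *data by ETH_GSTRING_LEN */

          ethtool_sprintf(data, "ch%d_tx_bytes", 0); /* formatted variant */
  }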
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.
The int return value in mlx5e_self_test_fill_strings() is not removed as
it is still used to return the number of selftests.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.
Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The first "new features" pull request for v6.10 with changes both in
stack and in drivers. The big thing in this pull request is that
wireless subsystem is now almost free of sparse warnings. There's only
one warning left in ath11k which was introduced in v6.9-rc1 and will
be fixed via the wireless tree.
Realtek drivers continue to improve, now we have support for RTL8922AE
and RTL8723CS devices. ath11k also has long waited support for P2P.
This time we have a small conflict in iwlwifi as we didn't consider it
as major enough to justify merging wireless tree to wireless-next. But
Stephen has an example merge resolution which should help with fixing
the conflict:
https://lore.kernel.org/all/20240326100945.765b8caf@canb.auug.org.au/
Major changes:
rtw89
* RTL8922AE Wi-Fi 7 PCI device support
rtw88
* RTL8723CS SDIO device support
iwlwifi
* don't support puncturing in 5 GHz
* support monitor mode on passive channels
* BZ-W device support
* P2P with HE/EHT support
ath11k
* P2P support for QCA6390, WCN6855 and QCA2066
Merge tag 'wireless-next-2024-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next
Kalle Valo says:
====================
wireless-next patches for v6.10
The first "new features" pull request for v6.10 with changes both in
stack and in drivers. The big thing in this pull request is that
wireless subsystem is now almost free of sparse warnings. There's only
one warning left in ath11k which was introduced in v6.9-rc1 and will
be fixed via the wireless tree.
Realtek drivers continue to improve, now we have support for RTL8922AE
and RTL8723CS devices. ath11k also has long waited support for P2P.
This time we have a small conflict in iwlwifi, Stephen has an example
merge resolution which should help with fixing the conflict:
https://lore.kernel.org/all/20240326100945.765b8caf@canb.auug.org.au/
Major changes:
rtw89
* RTL8922AE Wi-Fi 7 PCI device support
rtw88
* RTL8723CS SDIO device support
iwlwifi
* don't support puncturing in 5 GHz
* support monitor mode on passive channels
* BZ-W device support
* P2P with HE/EHT support
ath11k
* P2P support for QCA6390, WCN6855 and QCA2066
* tag 'wireless-next-2024-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (122 commits)
wifi: mt76: mt7915: workaround dubious x | !y warning
wifi: mwl8k: Avoid -Wflex-array-member-not-at-end warnings
wifi: ti: Avoid a hundred -Wflex-array-member-not-at-end warnings
wifi: iwlwifi: mvm: fix check in iwl_mvm_sta_fw_id_mask
net: rfkill: gpio: Convert to platform remove callback returning void
wifi: mac80211: use kvcalloc() for codel vars
wifi: iwlwifi: reconfigure TLC during HW restart
wifi: iwlwifi: mvm: don't change BA sessions during restart
wifi: iwlwifi: mvm: select STA mask only for active links
wifi: iwlwifi: mvm: set wider BW OFDMA ignore correctly
wifi: iwlwifi: Add support for LARI_CONFIG_CHANGE_CMD cmd v9
wifi: iwlwifi: mvm: Declare HE/EHT capabilities support for P2P interfaces
wifi: iwlwifi: mvm: Remove outdated comment
wifi: iwlwifi: add support for BZ_W
wifi: iwlwifi: Print a specific device name.
wifi: iwlwifi: remove wrong CRF_IDs
wifi: iwlwifi: remove devices that never came out
wifi: iwlwifi: mvm: mark EMLSR disabled in cleanup iterator
wifi: iwlwifi: mvm: fix active link counting during recovery
wifi: iwlwifi: mvm: assign link STA ID lookups during restart
...
====================
Link: https://lore.kernel.org/r/20240403093625.CF515C433C7@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ethtool.py depends on yml files in a specific location of the linux kernel
tree. Using relative lookup for those files means that ethtool.py would
need to be run under tools/net/ynl/. Look up the needed yml files without
depending on the current working directory that ethtool.py is invoked from.
Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/20240402204000.115081-1-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit implements VCT in the 88E308X/88E609X family.
It requires two workarounds with some magic configuration. Regular use
requires only one register configuration, but the Open Circuit case
requires a second workaround. This results in a two-phase implementation
of fault length measurement.
The Fast Ethernet PHY implements a very simple version of VCT. It's
completely different from vct5 or vct7.
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240402201123.2961909-3-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Some PHYs can recognize during a cable test if the impedance in the cable
is okay. They can detect reflections caused by impedance discontinuity
between a regular 100 Ohm cable and an abnormal part with a higher or
lower impedance.
This commit introduces a new result code:
ETHTOOL_A_CABLE_RESULT_CODE_IMPEDANCE_MISMATCH,
which represents the results of a cable test indicating issues with
impedance integrity.
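A PHY driver can report it through the existing cable-test helper, e.g.
(sketch):

  /* report an impedance mismatch detected on pair A */
  err = ethnl_cable_test_result(phydev, ETHTOOL_A_CABLE_PAIR_A,
                                ETHTOOL_A_CABLE_RESULT_CODE_IMPEDANCE_MISMATCH);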
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240402201123.2961909-2-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This patch implements only basic support.
It covers a PHY used in multiple ICs:
PHY: 88E3082, 88E3083
Switch: 88E6096, 88E6097
Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20240402201123.2961909-1-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be a multiple
of 64, so a packet slightly bigger than MTU+14, say 1536 bytes, can be
received and cause skb_over_panic.
Sample dmesg:
[ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL>
[ 5325.243689] ------------[ cut here ]------------
[ 5325.245748] kernel BUG at net/core/skbuff.c:192!
[ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60
[ 5325.302941] Call Trace:
[ 5325.304389] <IRQ>
[ 5325.315794] ? skb_panic+0x4f/0x60
[ 5325.317457] ? asm_exc_invalid_op+0x1f/0x30
[ 5325.319490] ? skb_panic+0x4f/0x60
[ 5325.321161] skb_put+0x4e/0x50
[ 5325.322670] mana_poll+0x6fa/0xb50 [mana]
[ 5325.324578] __napi_poll+0x33/0x1e0
[ 5325.326328] net_rx_action+0x12e/0x280
As discussed internally, this alignment is not necessary. To fix this bug,
remove it from the code, so oversized packets will be marked as
CQE_RX_TRUNCATED by the NIC and dropped.
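The arithmetic behind the 1536-byte frame in the panic above (with the
default MTU of 1500):

  mtu + ETH_HLEN  = 1500 + 14 = 1514    /* what the skb was sized for */
  ALIGN(1514, 64) = 1536                /* what the HW may DMA after rounding up */
  skb_put(skb, 1536)                    /* 22 bytes past the end -> skb_over_panic */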
Cc: stable@vger.kernel.org
Fixes: 2fbbd712ba ("net: mana: Enable RX path to handle various MTU sizes")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1712087316-20886-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
In "struct muram_info", the 'size' field is unused.
In "struct memac_cfg", the 'fixed_link' field is unused.
Remove them.
Found with cppcheck, unusedStructMember.
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/425222d4f6c584e8316ccb7b2ef415a85c96e455.1712084103.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Kuniyuki Iwashima says:
====================
af_unix: Remove old GC leftovers.
This is a follow-up series for commit 4090fa373f ("af_unix: Replace
garbage collection algorithm.") which introduced the new GC for AF_UNIX.
Now that the two ugly tricks for the old GC are no longer needed, let's
remove them.
====================
Link: https://lore.kernel.org/r/20240401173125.92184-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>