Commit Graph

1265631 Commits

Author SHA1 Message Date
linke li
04172043bd net: ethernet: mtk_eth_soc: Reuse value using READ_ONCE instead of re-rereading it
In mtk_flow_entry_update_l2, the hwe->ib1 is read using READ_ONCE at the
beginning of the function, checked, and then re-read from hwe->ib1,
may void all guarantees of the checks. Reuse the value that was read by
READ_ONCE to ensure the consistency of the ib1 throughout the function.

Signed-off-by: linke li <lilinke99@qq.com>
Link: https://lore.kernel.org/r/tencent_C699E9540505523424F11A9BD3D21B86840A@qq.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-04 15:46:52 +02:00
Christophe JAILLET
c120209bce net: dsa: sja1105: Fix parameters order in sja1110_pcs_mdio_write_c45()
The definition and declaration of sja1110_pcs_mdio_write_c45() don't have
parameters in the same order.

Knowing that sja1110_pcs_mdio_write_c45() is used as a function pointer
in 'sja1105_info' structure with .pcs_mdio_write_c45, and that we have:

   int (*pcs_mdio_write_c45)(struct mii_bus *bus, int phy, int mmd,
				  int reg, u16 val);

it is likely that the definition is the one to change.

Found with cppcheck, funcArgOrderDifferent.

Fixes: ae271547bb ("net: dsa: sja1105: C45 only transactions for PCS")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Michael Walle <mwalle@kernel.org>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://lore.kernel.org/r/ff2a5af67361988b3581831f7bd1eddebfb4c48f.1712082763.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-04 12:51:45 +02:00
Paul Barker
101b76418d net: ravb: Always update error counters
The error statistics should be updated each time the poll function is
called, even if the full RX work budget has been consumed. This prevents
the counts from becoming stuck when RX bandwidth usage is high.

This also ensures that error counters are not updated after we've
re-enabled interrupts as that could result in a race condition.

Also drop an unnecessary space.

Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240402145305.82148-2-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-04 12:46:13 +02:00
Paul Barker
596a425491 net: ravb: Always process TX descriptor ring
The TX queue should be serviced each time the poll function is called,
even if the full RX work budget has been consumed. This prevents
starvation of the TX queue when RX bandwidth usage is high.

Fixes: c156633f13 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Link: https://lore.kernel.org/r/20240402145305.82148-1-paul.barker.ct@bp.renesas.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-04 12:46:13 +02:00
Pablo Neira Ayuso
1bc83a019b netfilter: nf_tables: discard table flag update with pending basechain deletion
Hook unregistration is deferred to the commit phase, same occurs with
hook updates triggered by the table dormant flag. When both commands are
combined, this results in deleting a basechain while leaving its hook
still registered in the core.

Fixes: 179d9ba555 ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-04-04 11:38:35 +02:00
Ziyang Xuan
24225011d8 netfilter: nf_tables: Fix potential data-race in __nft_flowtable_type_get()
nft_unregister_flowtable_type() within nf_flow_inet_module_exit() can
concurrent with __nft_flowtable_type_get() within nf_tables_newflowtable().
And thhere is not any protection when iterate over nf_tables_flowtables
list in __nft_flowtable_type_get(). Therefore, there is pertential
data-race of nf_tables_flowtables list entry.

Use list_for_each_entry_rcu() to iterate over nf_tables_flowtables list
in __nft_flowtable_type_get(), and use rcu_read_lock() in the caller
nft_flowtable_type_get() to protect the entire type query process.

Fixes: 3b49e2e94e ("netfilter: nf_tables: add flow table netlink frontend")
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-04-04 11:38:34 +02:00
Pablo Neira Ayuso
994209ddf4 netfilter: nf_tables: reject new basechain after table flag update
When dormant flag is toggled, hooks are disabled in the commit phase by
iterating over current chains in table (existing and new).

The following configuration allows for an inconsistent state:

  add table x
  add chain x y { type filter hook input priority 0; }
  add table x { flags dormant; }
  add chain x w { type filter hook input priority 1; }

which triggers the following warning when trying to unregister chain w
which is already unregistered.

[  127.322252] WARNING: CPU: 7 PID: 1211 at net/netfilter/core.c:50                                                                     1 __nf_unregister_net_hook+0x21a/0x260
[...]
[  127.322519] Call Trace:
[  127.322521]  <TASK>
[  127.322524]  ? __warn+0x9f/0x1a0
[  127.322531]  ? __nf_unregister_net_hook+0x21a/0x260
[  127.322537]  ? report_bug+0x1b1/0x1e0
[  127.322545]  ? handle_bug+0x3c/0x70
[  127.322552]  ? exc_invalid_op+0x17/0x40
[  127.322556]  ? asm_exc_invalid_op+0x1a/0x20
[  127.322563]  ? kasan_save_free_info+0x3b/0x60
[  127.322570]  ? __nf_unregister_net_hook+0x6a/0x260
[  127.322577]  ? __nf_unregister_net_hook+0x21a/0x260
[  127.322583]  ? __nf_unregister_net_hook+0x6a/0x260
[  127.322590]  ? __nf_tables_unregister_hook+0x8a/0xe0 [nf_tables]
[  127.322655]  nft_table_disable+0x75/0xf0 [nf_tables]
[  127.322717]  nf_tables_commit+0x2571/0x2620 [nf_tables]

Fixes: 179d9ba555 ("netfilter: nf_tables: fix table flag updates")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-04-04 11:34:42 +02:00
Pablo Neira Ayuso
24cea96770 netfilter: nf_tables: flush pending destroy work before exit_net release
Similar to 2c9f029328 ("netfilter: nf_tables: flush pending destroy
work before netlink notifier") to address a race between exit_net and
the destroy workqueue.

The trace below shows an element to be released via destroy workqueue
while exit_net path (triggered via module removal) has already released
the set that is used in such transaction.

[ 1360.547789] BUG: KASAN: slab-use-after-free in nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
[ 1360.547861] Read of size 8 at addr ffff888140500cc0 by task kworker/4:1/152465
[ 1360.547870] CPU: 4 PID: 152465 Comm: kworker/4:1 Not tainted 6.8.0+ #359
[ 1360.547882] Workqueue: events nf_tables_trans_destroy_work [nf_tables]
[ 1360.547984] Call Trace:
[ 1360.547991]  <TASK>
[ 1360.547998]  dump_stack_lvl+0x53/0x70
[ 1360.548014]  print_report+0xc4/0x610
[ 1360.548026]  ? __virt_addr_valid+0xba/0x160
[ 1360.548040]  ? __pfx__raw_spin_lock_irqsave+0x10/0x10
[ 1360.548054]  ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
[ 1360.548176]  kasan_report+0xae/0xe0
[ 1360.548189]  ? nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
[ 1360.548312]  nf_tables_trans_destroy_work+0x3f5/0x590 [nf_tables]
[ 1360.548447]  ? __pfx_nf_tables_trans_destroy_work+0x10/0x10 [nf_tables]
[ 1360.548577]  ? _raw_spin_unlock_irq+0x18/0x30
[ 1360.548591]  process_one_work+0x2f1/0x670
[ 1360.548610]  worker_thread+0x4d3/0x760
[ 1360.548627]  ? __pfx_worker_thread+0x10/0x10
[ 1360.548640]  kthread+0x16b/0x1b0
[ 1360.548653]  ? __pfx_kthread+0x10/0x10
[ 1360.548665]  ret_from_fork+0x2f/0x50
[ 1360.548679]  ? __pfx_kthread+0x10/0x10
[ 1360.548690]  ret_from_fork_asm+0x1a/0x30
[ 1360.548707]  </TASK>

[ 1360.548719] Allocated by task 192061:
[ 1360.548726]  kasan_save_stack+0x20/0x40
[ 1360.548739]  kasan_save_track+0x14/0x30
[ 1360.548750]  __kasan_kmalloc+0x8f/0xa0
[ 1360.548760]  __kmalloc_node+0x1f1/0x450
[ 1360.548771]  nf_tables_newset+0x10c7/0x1b50 [nf_tables]
[ 1360.548883]  nfnetlink_rcv_batch+0xbc4/0xdc0 [nfnetlink]
[ 1360.548909]  nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
[ 1360.548927]  netlink_unicast+0x367/0x4f0
[ 1360.548935]  netlink_sendmsg+0x34b/0x610
[ 1360.548944]  ____sys_sendmsg+0x4d4/0x510
[ 1360.548953]  ___sys_sendmsg+0xc9/0x120
[ 1360.548961]  __sys_sendmsg+0xbe/0x140
[ 1360.548971]  do_syscall_64+0x55/0x120
[ 1360.548982]  entry_SYSCALL_64_after_hwframe+0x55/0x5d

[ 1360.548994] Freed by task 192222:
[ 1360.548999]  kasan_save_stack+0x20/0x40
[ 1360.549009]  kasan_save_track+0x14/0x30
[ 1360.549019]  kasan_save_free_info+0x3b/0x60
[ 1360.549028]  poison_slab_object+0x100/0x180
[ 1360.549036]  __kasan_slab_free+0x14/0x30
[ 1360.549042]  kfree+0xb6/0x260
[ 1360.549049]  __nft_release_table+0x473/0x6a0 [nf_tables]
[ 1360.549131]  nf_tables_exit_net+0x170/0x240 [nf_tables]
[ 1360.549221]  ops_exit_list+0x50/0xa0
[ 1360.549229]  free_exit_list+0x101/0x140
[ 1360.549236]  unregister_pernet_operations+0x107/0x160
[ 1360.549245]  unregister_pernet_subsys+0x1c/0x30
[ 1360.549254]  nf_tables_module_exit+0x43/0x80 [nf_tables]
[ 1360.549345]  __do_sys_delete_module+0x253/0x370
[ 1360.549352]  do_syscall_64+0x55/0x120
[ 1360.549360]  entry_SYSCALL_64_after_hwframe+0x55/0x5d

(gdb) list *__nft_release_table+0x473
0x1e033 is in __nft_release_table (net/netfilter/nf_tables_api.c:11354).
11349           list_for_each_entry_safe(flowtable, nf, &table->flowtables, list) {
11350                   list_del(&flowtable->list);
11351                   nft_use_dec(&table->use);
11352                   nf_tables_flowtable_destroy(flowtable);
11353           }
11354           list_for_each_entry_safe(set, ns, &table->sets, list) {
11355                   list_del(&set->list);
11356                   nft_use_dec(&table->use);
11357                   if (set->flags & (NFT_SET_MAP | NFT_SET_OBJECT))
11358                           nft_map_deactivate(&ctx, set);
(gdb)

[ 1360.549372] Last potentially related work creation:
[ 1360.549376]  kasan_save_stack+0x20/0x40
[ 1360.549384]  __kasan_record_aux_stack+0x9b/0xb0
[ 1360.549392]  __queue_work+0x3fb/0x780
[ 1360.549399]  queue_work_on+0x4f/0x60
[ 1360.549407]  nft_rhash_remove+0x33b/0x340 [nf_tables]
[ 1360.549516]  nf_tables_commit+0x1c6a/0x2620 [nf_tables]
[ 1360.549625]  nfnetlink_rcv_batch+0x728/0xdc0 [nfnetlink]
[ 1360.549647]  nfnetlink_rcv+0x1a8/0x1e0 [nfnetlink]
[ 1360.549671]  netlink_unicast+0x367/0x4f0
[ 1360.549680]  netlink_sendmsg+0x34b/0x610
[ 1360.549690]  ____sys_sendmsg+0x4d4/0x510
[ 1360.549697]  ___sys_sendmsg+0xc9/0x120
[ 1360.549706]  __sys_sendmsg+0xbe/0x140
[ 1360.549715]  do_syscall_64+0x55/0x120
[ 1360.549725]  entry_SYSCALL_64_after_hwframe+0x55/0x5d

Fixes: 0935d55884 ("netfilter: nf_tables: asynchronous release")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-04-04 11:34:42 +02:00
Pablo Neira Ayuso
0d459e2ffb netfilter: nf_tables: release mutex after nft_gc_seq_end from abort path
The commit mutex should not be released during the critical section
between nft_gc_seq_begin() and nft_gc_seq_end(), otherwise, async GC
worker could collect expired objects and get the released commit lock
within the same GC sequence.

nf_tables_module_autoload() temporarily releases the mutex to load
module dependencies, then it goes back to replay the transaction again.
Move it at the end of the abort phase after nft_gc_seq_end() is called.

Cc: stable@vger.kernel.org
Fixes: 720344340f ("netfilter: nf_tables: GC transaction race with abort path")
Reported-by: Kuan-Ting Chen <hexrabbit@devco.re>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-04-04 11:34:42 +02:00
Pablo Neira Ayuso
a45e688957 netfilter: nf_tables: release batch on table validation from abort path
Unlike early commit path stage which triggers a call to abort, an
explicit release of the batch is required on abort, otherwise mutex is
released and commit_list remains in place.

Add WARN_ON_ONCE to ensure commit_list is empty from the abort path
before releasing the mutex.

After this patch, commit_list is always assumed to be empty before
grabbing the mutex, therefore

  03c1f1ef15 ("netfilter: Cleanup nft_net->module_list from nf_tables_exit_net()")

only needs to release the pending modules for registration.

Cc: stable@vger.kernel.org
Fixes: c0391b6ab8 ("netfilter: nf_tables: missing validation from the abort path")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2024-04-04 11:34:41 +02:00
Paolo Abeni
72076fc9fe Revert "tg3: Remove residual error handling in tg3_suspend"
This reverts commit 9ab4ad2956.

I went out of coffee and applied it to the wrong tree. Blame on me.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-04 10:51:01 +02:00
Nikita Kiryushin
d72b735712 tg3: Remove residual error handling in tg3_suspend
As of now, tg3_power_down_prepare always ends with success, but
the error handling code from former tg3_set_power_state call is still here.

This code became unreachable in commit c866b7eac0 ("tg3: Do not use
legacy PCI power management").

Remove (now unreachable) error handling code for simplification and change
tg3_power_down_prepare to a void function as its result is no more checked.

Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240401191418.361747-1-kiryushin@ancud.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-04 10:49:42 +02:00
Nikita Kiryushin
9ab4ad2956 tg3: Remove residual error handling in tg3_suspend
As of now, tg3_power_down_prepare always ends with success, but
the error handling code from former tg3_set_power_state call is still here.

This code became unreachable in commit c866b7eac0 ("tg3: Do not use
legacy PCI power management").

Remove (now unreachable) error handling code for simplification and change
tg3_power_down_prepare to a void function as its result is no more checked.

Signed-off-by: Nikita Kiryushin <kiryushin@ancud.ru>
Reviewed-by: Michael Chan <michael.chan@broadcom.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240401191418.361747-1-kiryushin@ancud.ru
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-04-04 10:16:50 +02:00
Jakub Kicinski
57a03d83f2 Merge branch 'mlxsw-preparations-for-improving-performance'
Petr Machata says:

====================
mlxsw: Preparations for improving performance

Amit Cohen writes:

mlxsw driver will use NAPI for event processing in a next patch set.
Some additional improvements will be added later. This patch set
prepares the code for NAPI usage and refactor some relevant areas. See
more details in commit messages.

Patch Set overview:
Patches #1-#2 are preparations for patch #3
Patch #3 setups tasklets as part of queue initializtion
Patch #4 removes handling of unlikely scenario
Patch #5 removes unused counters
Patch #6 makes style change in mlxsw_pci_eq_tasklet()
Patch #7-#10 poll command interface instead of EQ0 usage
Patches #11-#12 make style change and break the function
mlxsw_pci_cq_tasklet()
Patches #13-#14 remove functions which can be replaced by a stored value
Patch #15 improves accessing to descriptor queue instance
====================

Link: https://lore.kernel.org/r/cover.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:44 -07:00
Amit Cohen
77c6e27df9 mlxsw: pci: Store DQ pointer as part of CQ structure
Currently, for each completion, we check the number of descriptor queue
and take it via mlxsw_pci_{sdq,rdq}_get(). This is inefficient, the
DQ should be the same for all the completions in CQ, as each CQ handles
only one DQ - SDQ or RDQ. This mapping is handled as part of DQ
initialization via mlxsw_cmd_mbox_sw2hw_dq_cq_set().

Instead, as part of DQ initialization, set DQ pointer in the appropriate
CQ structure. When we handle completions, warn in case that the DQ number
that we expect is different from the number we get in the CQE. Call
WARN_ON_ONCE() only after checking the value, to avoid calling this method
for each completion.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/a5b2559cd6d532c120f3194f89a1e257110318f1.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:41 -07:00
Amit Cohen
82238f0ddb mlxsw: pci: Remove mlxsw_pci_cq_count()
Currently, for each interrupt we call mlxsw_pci_cq_count() to determine the
number of CQs. This call makes additional two function's calls. This can
be removed by storing this value as part of structure 'mlxsw_pci', as we
already do for number of SDQs. Remove the function and
__mlxsw_pci_queue_count() which is now not used and store the value
instead.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f08ad113e8160678f3c8d401382a696c6c7f44c7.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:41 -07:00
Amit Cohen
0cd1453b7e mlxsw: pci: Remove mlxsw_pci_sdq_count()
The number of SDQs is stored as part of 'mlxsw_pci' structure. In some
cases, the driver uses this value and in some cases it calls
mlxsw_pci_sdq_count() to get the value. Align the code to use the
stored value. This simplifies the code and makes it clearer that the
value is always the same. Rename 'mlxsw_pci->num_sdq_cqs' to
'mlxsw_pci->num_sdqs' as now it is used not only in CQ context.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/0c8788506d9af35d589dbf64be35a508fd63d681.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:41 -07:00
Amit Cohen
1df7d871e3 mlxsw: pci: Break mlxsw_pci_cq_tasklet() into tasklets per queue type
Completion queues are used for completions of RDQ or SDQ. Each
completion queue is used for one DQ. The first CQs are used for SDQs and
the rest are used for RDQs.

Currently, for each CQE (completion queue element), we check 'sr' value
(send/receive) to know if it is completion of RDQ or SDQ. Actually, we
do not really have to check it, as according to the queue number we know
if it handles completions of Rx or Tx.

Break the tasklet into two - one for Rx (RDQ) and one for Tx (SDQ). Then,
setup the appropriate tasklet for each queue as part of queue
initialization. Use 'sr' value for unlikely case that we get completion
with type that we do not expect. Call WARN_ON_ONCE() only after checking
the value, to avoid calling this method for each completion.

A next patch set will use NAPI to handle events, then we will have a
separate poll method for Rx and Tx. This change is a preparation for
NAPI usage.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/50fbc366f8de54cb5dc72a7c4f394333ef71f1d0.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:40 -07:00
Amit Cohen
a0639236d4 mlxsw: pci: Make style change in mlxsw_pci_cq_tasklet()
This function will be broken into several functions later. As preparation,
reorder variables to reverse xmas tree.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/7170a8f4429ecb5a539b0374c621697778ff8363.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:40 -07:00
Amit Cohen
2c200863fc mlxsw: pci: Remove unused wait queue
The previous patch changed the code to do not handle command interface
from event queue. With this change the wait queue is not used anymore.
Remove it and 'wait_done' variable.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f3af6a5a9dabd97d2920cefe475c6aa57767f504.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:40 -07:00
Amit Cohen
6fc280a365 mlxsw: pci: Use only one event queue
The device supports two event queues. EQ0 is used for command interface
completion events. EQ1 is used for completion events of RDQ or SDQ.

Currently, for each EQE (event queue element), we check the queue number
and handle accordingly. More than that, for each interrupt we schedule
tasklets for both EQs. This is really ineffective, especially because of
the fact that EQ0 is used only as part of driver init/fini, when EMADs are
not available. There is no point to schedule the tasklet for it and check
each EQE.

A previous patch changed the code to poll command interface for each use of
it. It means that now there is no real reason to use EQ0, as we poll the
command interface.

Initialize only one event queue and use it as EQ1 (this is determined by
queue number). Then, for each interrupt we can schedule the tasklet only
for one queue and we do not have to check the queue number. This
simplifies the code and should improve performance. Note that polling
command interface is ok as we use it only as part of driver init/fini.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/23d764f5c032e4c363b98590b746a4b32d2bf900.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:40 -07:00
Amit Cohen
7bc6a3098c mlxsw: pci: Rename MLXSW_PCI_EQS_COUNT
Currently we use MLXSW_PCI_EQS_COUNT event queues. A next patch will
change the driver to initialize only EQ1, as EQ0 is not required anymore
when we poll command interface.

Rename the macro to MLXSW_PCI_EQS_MAX as later we will not initialize
the maximum supported EQs, this value represents the maximum and a new
macro will be added to represent the actual used queues.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/b08df430b62f23ca1aa3aaa257896d2d95aa7691.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:40 -07:00
Amit Cohen
d4b3930b19 mlxsw: pci: Poll command interface for each cmd_exec()
Command interface is used for configuring and querying FW when EMADs are
not available. During the time that the driver sets up the asynchronous
queues, it polls the command interface for getting completions. Then,
there is a short period when asynchronous queues work, but EMADs are not
available (marked in the code as nopoll = true). During this time, we
send commands via command interface, but we do not poll it, as we can get
an interrupt for the completion. Completions of command interface are
received from HW in EQ0 (event queue 0).

The usage of EQ0 instead of polling is done only 4 times during
initialization and one time during tear down, but it makes an overhead
during lifetime of the driver. For each interrupt, we have to check if
we get events in EQ0 or EQ1 and handle them. This is really ineffective,
especially because of the fact that EQ0 is used only as part of driver
init/fini.

Instead, we can poll command interface for each call of cmd_exec(). It
means that when we send a command via command interface (as EMADs are
not available), we will poll it, regardless of availability of the
asynchronous queues. This will allow us to configure later only EQ1 and
simplify the flow.

Remove 'nopoll' indication and change mlxsw_pci_cmd_exec() to poll till
answer/timeout regardless of queues' state. For now, completions are
handled also by EQ0, but it will be removed in next patch. Additional
cleanups will be added in next patches.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/e674c70380ceda953e0e45a77334c5d22e69938f.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:40 -07:00
Amit Cohen
29ad2a9906 mlxsw: pci: Make style changes in mlxsw_pci_eq_tasklet()
This function will be used later only for EQ1. As preparation, reorder
variables to reverse xmas tree and return earlier when it is possible, to
simplify the code.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/2412d6c135b2a6aedb4484f5d8baab3aecd7b9ae.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:40 -07:00
Amit Cohen
57beea8e56 mlxsw: pci: Remove unused counters
The structure 'mlxsw_pci_queue' stores several counters which were consumed
via debugfs. Since commit 9a32562bec ("mlxsw: Remove debugfs interface"),
these counters are not used. Remove them. This makes the 'union u' and
'struct eq' redundant. Maintain 'struct cq' as it will be extended later.

Replace increasing 'q->u.eq.ev_other_count' with WARN_ON_ONCE(), as it is
used in an unreasonable case of receiving event in EQ which is not EQ0 or
EQ1. When the queues are initialized, we check number of event queues and
fail with the print "Unsupported number of queues" in case that the driver
tries to initialize more than two queues.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/ee9e658800aa0390e08342100bc27daff4c176c0.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:39 -07:00
Amit Cohen
38b124cb4e mlxsw: pci: Arm CQ doorbell regardless of number of completions
Currently, as part of mlxsw_pci_cq_tasklet(), we check if any item
was handled, and only in such case we arm doorbell. This is unlikely case,
as we schedule tasklet only for CQs that we get an event for them, which
means that they contain completions to handle. Remove this check, which
is supposed to be true always, and even if it is false, it is not a mistake
to ring the doorbell. We can warn on such case, but it is not really worth
to add a check which will be run for each CQ handling when we do not expect
to reach it and it does not point to logic error that should be handled.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/f8efa481bfe7bebb9f93bb803f44ab7da77f53e6.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:39 -07:00
Amit Cohen
fb29028ae7 mlxsw: pci: Do not setup tasklet from operation
Currently, the structure 'mlxsw_pci_queue_ops' holds a pointer to the
callback function of tasklet. This is used only for EQ and CQ. mlxsw
driver will use NAPI in a following patch set, so CQ will not use tasklet
anymore. As preparation, remove this pointer from the shared operation
structure and setup the tasklet as part of queue initialization.
For now, setup tasklet for EQ and CQ. Later, CQ code will be changed.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/a326cae5fc1ad085a1a063c004983de6fe389414.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:39 -07:00
Amit Cohen
f46de9f0e7 mlxsw: pci: Move mlxsw_pci_cq_{init, fini}()
Move mlxsw_pci_cq_{init, fini}() after mlxsw_pci_cq_tasklet() as a next
patch will setup the tasklet as part of initialization.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/25196cb5baf5acf6ec1e956203790e018ba8e306.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:39 -07:00
Amit Cohen
d38718a525 mlxsw: pci: Move mlxsw_pci_eq_{init, fini}()
Move mlxsw_pci_eq_{init, fini}() after mlxsw_pci_eq_tasklet() as a next
patch will setup the tasklet as part of initialization.

Signed-off-by: Amit Cohen <amcohen@nvidia.com>
Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/7ae120a02e1c490084daae7e684a0d40b7cce4e7.1712062203.git.petrm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:50:39 -07:00
Jakub Kicinski
6b164687f8 Merge branch 'mlx5-misc-patches'
Tariq Toukan says:

====================
mlx5 misc patches

This patchset includes small features and misc code enhancements for the
mlx5 core and EN drivers.

Patches 1-4 by Gal improves the mlx5e ethtool stats implementation, for
example by using standard helpers ethtool_sprintf/puts.

Patch 5 by me adds a reset option for the FW command interface debugfs
stats entries. This allows explicit FW command interface stats reset
between different runs of a test case.

Patches 6 and 7 are simple cleanups.

Patch 8 by Gal adds driver support for 800Gbps link modes.

Patch 9 by Jianbo enhances the L4 steering abilities.

Patches 10-11 by Jianbo save redundant operations.
====================

Link: https://lore.kernel.org/r/20240402133043.56322-1-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:48:01 -07:00
Jianbo Liu
07e1bc785a net/mlx5: Don't call give_pages() if request 0 page
Firmware will return 0 on query BOOT/INIT PAGES for non-page supplier
functions (external host PF/VF/SF), so no page is needed to be
allocated for them.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-12-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:59 -07:00
Jianbo Liu
c788d79cfa net/mlx5: Skip pages EQ creation for non-page supplier function
Page events are not issued by device on the function if
page_request_disable is set, so no need to create pages EQ.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Parav Pandit <parav@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-11-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:59 -07:00
Jianbo Liu
137f3d50ad net/mlx5: Support matching on l4_type for ttc_table
Replace matching on TCP and UDP protocols with new l4_type field which
is parsed by steering for ttc_table. It is enabled by the
outer_l4_type or inner_l4_type bits in nic_rx or port_sel flow table
capabilities and used only if pcc_ifa2 bit in HCA capabilities is set.

Signed-off-by: Jianbo Liu <jianbol@nvidia.com>
Reviewed-by: Mark Bloch <mbloch@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-10-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:59 -07:00
Gal Pressman
8c54c89ad4 net/mlx5e: Add support for 800Gbps link modes
Add support for 800Gbps speed, link modes of 100Gbps per lane.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Cosmin Ratiu <cratiu@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Link: https://lore.kernel.org/r/20240402133043.56322-9-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:58 -07:00
Gal Pressman
30f8d23814 net/mlx5: Convert uintX_t to uX
In the kernel, the preferred types are uX.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-8-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:58 -07:00
Carolina Jubran
595f41608d net/mlx5e: XDP, Fix an inconsistent comment
Starting from commit
eb9b9fdcaf ("net/mlx5e: Introduce extended version for mlx5e_xmit_data")
sinfo is no longer passed as an argument to
mlx5e_xmit_xdp_frame(), the comment is inconsistent.

check_result must be zero when the packet is fragmented.

Signed-off-by: Carolina Jubran <cjubran@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-7-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:58 -07:00
Tariq Toukan
19b85f1b37 net/mlx5e: debugfs, Add reset option for command interface stats
Resetting stats just before some test/debug case allows us to eliminate
out the impact of previous commands. Useful in particular for the
average latency calculation.

The average_write() callback was unreachable, as "average" is a
read-only file. Extend, rename,  and use it for a newly exposed
write-only "reset" file.

Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Saeed Mahameed <saeedm@nvidia.com>
Reviewed-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-6-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:58 -07:00
Gal Pressman
27ea84ab35 net/mlx5e: Make stats group fill_stats callbacks consistent with the API
The fill_strings() callbacks were changed to accept a **data pointer,
and not rely on propagating the index value.
Make a similar change to fill_stats() callbacks to keep the API
consistent.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-5-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:58 -07:00
Gal Pressman
89b34322d2 net/mlx5e: Use ethtool_sprintf/puts() to fill stats strings
Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.

Change the fill_strings callback to accept a **data pointer, and remove
the index and return value.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-4-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:58 -07:00
Gal Pressman
9ac9299d41 net/mlx5e: Use ethtool_sprintf/puts() to fill selftests strings
Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.

The int return value in mlx5e_self_test_fill_strings() is not removed as
it is still used to return the number of selftests.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-3-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:58 -07:00
Gal Pressman
e2d515eb8f net/mlx5e: Use ethtool_sprintf/puts() to fill priv flags strings
Use ethtool_sprintf/puts() helper functions which handle the common
pattern of printing a string into the ethtool strings interface and
incrementing the string pointer by ETH_GSTRING_LEN.

Signed-off-by: Gal Pressman <gal@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Signed-off-by: Tariq Toukan <tariqt@nvidia.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240402133043.56322-2-tariqt@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:47:57 -07:00
Jakub Kicinski
8c73e8b595 wireless-next patches for v6.10
The first "new features" pull request for v6.10 with changes both in
 stack and in drivers. The big thing in this pull request is that
 wireless subsystem is now almost free of sparse warnings. There's only
 one warning left in ath11k which was introduced in v6.9-rc1 and will
 be fixed via the wireless tree.
 
 Realtek drivers continue to improve, now we have support for RTL8922AE
 and RTL8723CS devices. ath11k also has long waited support for P2P.
 
 This time we have a small conflict in iwlwifi as we didn't consider it
 as major enough to justify merging wireless tree to wireless-next. But
 Stephen has an example merge resolution which should help with fixing
 the conflict:
 
 https://lore.kernel.org/all/20240326100945.765b8caf@canb.auug.org.au/
 
 Major changes:
 
 rtw89
 
 * RTL8922AE Wi-Fi 7 PCI device support
 
 rtw88
 
 * RTL8723CS SDIO device support
 
 iwlwifi
 
 * don't support puncturing in 5 GHz
 
 * support monitor mode on passive channels
 
 * BZ-W device support
 
 * P2P with HE/EHT support
 
 ath11k
 
 * P2P support for QCA6390, WCN6855 and QCA2066
 -----BEGIN PGP SIGNATURE-----
 
 iQFFBAABCgAvFiEEiBjanGPFTz4PRfLobhckVSbrbZsFAmYNIqIRHGt2YWxvQGtl
 cm5lbC5vcmcACgkQbhckVSbrbZt8jAf9H+o91boD34/qVdI5LWEcFhVKEkHpNtwm
 Y1sTKNBEtN1Gs2zcljjO6PqN9N4v2+lA42KSpzP5M42FfpI2aATI2v8jYsKTXOl2
 YVwF+8pDiAsi0YtQTxIthygjzTpsePCfj8z0xJaKGm195T+fMm9UebYETrfxxOp/
 z5StsJIPI0twgSLKKUWvLpX4ESt0l0HLJY1ok99sk4Cj36EKn6b9LbBinDKr6GcQ
 mGNtPyq0j4l0kS5qae9BbXZUohO54o8wiFnApdwGfA7S/kLY7eUtwZy7T050b62P
 zbNafwZbIjrH7dNcGfe6Fdr7PjQYFeI5Nh7dXxqM2LJOQsYXU/tcWQ==
 =WrPE
 -----END PGP SIGNATURE-----

Merge tag 'wireless-next-2024-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next

Kalle Valo says:

====================
wireless-next patches for v6.10

The first "new features" pull request for v6.10 with changes both in
stack and in drivers. The big thing in this pull request is that
wireless subsystem is now almost free of sparse warnings. There's only
one warning left in ath11k which was introduced in v6.9-rc1 and will
be fixed via the wireless tree.

Realtek drivers continue to improve, now we have support for RTL8922AE
and RTL8723CS devices. ath11k also has long waited support for P2P.

This time we have a small conflict in iwlwifi, Stephen has an example
merge resolution which should help with fixing the conflict:

https://lore.kernel.org/all/20240326100945.765b8caf@canb.auug.org.au/

Major changes:

rtw89
 * RTL8922AE Wi-Fi 7 PCI device support

rtw88
 * RTL8723CS SDIO device support

iwlwifi
 * don't support puncturing in 5 GHz
 * support monitor mode on passive channels
 * BZ-W device support
 * P2P with HE/EHT support

ath11k
 * P2P support for QCA6390, WCN6855 and QCA2066

* tag 'wireless-next-2024-04-03' of git://git.kernel.org/pub/scm/linux/kernel/git/wireless/wireless-next: (122 commits)
  wifi: mt76: mt7915: workaround dubious x | !y warning
  wifi: mwl8k: Avoid -Wflex-array-member-not-at-end warnings
  wifi: ti: Avoid a hundred -Wflex-array-member-not-at-end warnings
  wifi: iwlwifi: mvm: fix check in iwl_mvm_sta_fw_id_mask
  net: rfkill: gpio: Convert to platform remove callback returning void
  wifi: mac80211: use kvcalloc() for codel vars
  wifi: iwlwifi: reconfigure TLC during HW restart
  wifi: iwlwifi: mvm: don't change BA sessions during restart
  wifi: iwlwifi: mvm: select STA mask only for active links
  wifi: iwlwifi: mvm: set wider BW OFDMA ignore correctly
  wifi: iwlwifi: Add support for LARI_CONFIG_CHANGE_CMD cmd v9
  wifi: iwlwifi: mvm: Declare HE/EHT capabilities support for P2P interfaces
  wifi: iwlwifi: mvm: Remove outdated comment
  wifi: iwlwifi: add support for BZ_W
  wifi: iwlwifi: Print a specific device name.
  wifi: iwlwifi: remove wrong CRF_IDs
  wifi: iwlwifi: remove devices that never came out
  wifi: iwlwifi: mvm: mark EMLSR disabled in cleanup iterator
  wifi: iwlwifi: mvm: fix active link counting during recovery
  wifi: iwlwifi: mvm: assign link STA ID lookups during restart
  ...
====================

Link: https://lore.kernel.org/r/20240403093625.CF515C433C7@smtp.kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:36:57 -07:00
Rahul Rameshbabu
ca3e10c4d8 tools: ynl: ethtool.py: Make tool invokable from any CWD
ethtool.py depends on yml files in a specific location of the linux kernel
tree. Using relative lookup for those files means that ethtool.py would
need to be run under tools/net/ynl/. Lookup needed yml files without
depending on the current working directory that ethtool.py is invoked from.

Signed-off-by: Rahul Rameshbabu <rrameshbabu@nvidia.com>
Reviewed-by: Dragos Tatulea <dtatulea@nvidia.com>
Link: https://lore.kernel.org/r/20240402204000.115081-1-rrameshbabu@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:34:08 -07:00
Pawel Dembicki
a9e4230d0b net: phy: marvell: implement cable-test for 88E308X/88E609X family
This commit implements VCT in 88E308X/88E609X Family.

It require two workarounds with some magic configuration.
Regular use require only one register configuration. But Open Circuit
require second workaround.
It cause implementation two phases for fault length measuring.

Fast Ethernet PHY have implemented very simple version of VCT. It's
complitley different than vct5 or vct7.

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240402201123.2961909-3-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:33:20 -07:00
Pawel Dembicki
9cc8a6e626 net: ethtool: Add impedance mismatch result code to cable test
Some PHYs can recognize during a cable test if the impedance in the cable
is okay. They can detect reflections caused by impedance discontinuity
between a regular 100 Ohm cable and an abnormal part with a higher or
lower impedance.

This commit introduces a new result code:
ETHTOOL_A_CABLE_RESULT_CODE_IMPEDANCE_MISMATCH,
which represents the results of a cable test indicating issues with
impedance integrity.

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20240402201123.2961909-2-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:33:20 -07:00
Pawel Dembicki
ada9841e3e net: phy: marvell: add basic support of 88E308X/88E609X family
This patch implements only basic support.

It covers PHY used in multiple IC:
PHY: 88E3082, 88E3083
Switch: 88E6096, 88E6097

Signed-off-by: Pawel Dembicki <paweldembicki@gmail.com>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Link: https://lore.kernel.org/r/20240402201123.2961909-1-paweldembicki@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:33:20 -07:00
Haiyang Zhang
c0de6ab920 net: mana: Fix Rx DMA datasize and skb_over_panic
mana_get_rxbuf_cfg() aligns the RX buffer's DMA datasize to be
multiple of 64. So a packet slightly bigger than mtu+14, say 1536,
can be received and cause skb_over_panic.

Sample dmesg:
[ 5325.237162] skbuff: skb_over_panic: text:ffffffffc043277a len:1536 put:1536 head:ff1100018b517000 data:ff1100018b517100 tail:0x700 end:0x6ea dev:<NULL>
[ 5325.243689] ------------[ cut here ]------------
[ 5325.245748] kernel BUG at net/core/skbuff.c:192!
[ 5325.247838] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[ 5325.258374] RIP: 0010:skb_panic+0x4f/0x60
[ 5325.302941] Call Trace:
[ 5325.304389]  <IRQ>
[ 5325.315794]  ? skb_panic+0x4f/0x60
[ 5325.317457]  ? asm_exc_invalid_op+0x1f/0x30
[ 5325.319490]  ? skb_panic+0x4f/0x60
[ 5325.321161]  skb_put+0x4e/0x50
[ 5325.322670]  mana_poll+0x6fa/0xb50 [mana]
[ 5325.324578]  __napi_poll+0x33/0x1e0
[ 5325.326328]  net_rx_action+0x12e/0x280

As discussed internally, this alignment is not necessary. To fix
this bug, remove it from the code. So oversized packets will be
marked as CQE_RX_TRUNCATED by NIC, and dropped.

Cc: stable@vger.kernel.org
Fixes: 2fbbd712ba ("net: mana: Enable RX path to handle various MTU sizes")
Signed-off-by: Haiyang Zhang <haiyangz@microsoft.com>
Reviewed-by: Dexuan Cui <decui@microsoft.com>
Link: https://lore.kernel.org/r/1712087316-20886-1-git-send-email-haiyangz@microsoft.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:32:03 -07:00
Christophe JAILLET
04af1d6437 net: fman: Remove some unused fields in some structure
In "struct muram_info", the 'size' field is unused.
In "struct memac_cfg", the 'fixed_link' field is unused.

Remove them.

Found with cppcheck, unusedStructMember.

Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Sean Anderson <sean.anderson@seco.com>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/425222d4f6c584e8316ccb7b2ef415a85c96e455.1712084103.git.christophe.jaillet@wanadoo.fr
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:30:28 -07:00
Eric Dumazet
7eb322360b net/sched: fix lockdep splat in qdisc_tree_reduce_backlog()
qdisc_tree_reduce_backlog() is called with the qdisc lock held,
not RTNL.

We must use qdisc_lookup_rcu() instead of qdisc_lookup()

syzbot reported:

WARNING: suspicious RCU usage
6.1.74-syzkaller #0 Not tainted
-----------------------------
net/sched/sch_api.c:305 suspicious rcu_dereference_protected() usage!

other info that might help us debug this:

rcu_scheduler_active = 2, debug_locks = 1
3 locks held by udevd/1142:
  #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
  #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
  #0: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: net_tx_action+0x64a/0x970 net/core/dev.c:5282
  #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: spin_lock include/linux/spinlock.h:350 [inline]
  #1: ffff888171861108 (&sch->q.lock){+.-.}-{2:2}, at: net_tx_action+0x754/0x970 net/core/dev.c:5297
  #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_lock_acquire include/linux/rcupdate.h:306 [inline]
  #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: rcu_read_lock include/linux/rcupdate.h:747 [inline]
  #2: ffffffff87c729a0 (rcu_read_lock){....}-{1:2}, at: qdisc_tree_reduce_backlog+0x84/0x580 net/sched/sch_api.c:792

stack backtrace:
CPU: 1 PID: 1142 Comm: udevd Not tainted 6.1.74-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
 <TASK>
  [<ffffffff85b85f14>] __dump_stack lib/dump_stack.c:88 [inline]
  [<ffffffff85b85f14>] dump_stack_lvl+0x1b1/0x28f lib/dump_stack.c:106
  [<ffffffff85b86007>] dump_stack+0x15/0x1e lib/dump_stack.c:113
  [<ffffffff81802299>] lockdep_rcu_suspicious+0x1b9/0x260 kernel/locking/lockdep.c:6592
  [<ffffffff84f0054c>] qdisc_lookup+0xac/0x6f0 net/sched/sch_api.c:305
  [<ffffffff84f037c3>] qdisc_tree_reduce_backlog+0x243/0x580 net/sched/sch_api.c:811
  [<ffffffff84f5b78c>] pfifo_tail_enqueue+0x32c/0x4b0 net/sched/sch_fifo.c:51
  [<ffffffff84fbcf63>] qdisc_enqueue include/net/sch_generic.h:833 [inline]
  [<ffffffff84fbcf63>] netem_dequeue+0xeb3/0x15d0 net/sched/sch_netem.c:723
  [<ffffffff84eecab9>] dequeue_skb net/sched/sch_generic.c:292 [inline]
  [<ffffffff84eecab9>] qdisc_restart net/sched/sch_generic.c:397 [inline]
  [<ffffffff84eecab9>] __qdisc_run+0x249/0x1e60 net/sched/sch_generic.c:415
  [<ffffffff84d7aa96>] qdisc_run+0xd6/0x260 include/net/pkt_sched.h:125
  [<ffffffff84d85d29>] net_tx_action+0x7c9/0x970 net/core/dev.c:5313
  [<ffffffff85e002bd>] __do_softirq+0x2bd/0x9bd kernel/softirq.c:616
  [<ffffffff81568bca>] invoke_softirq kernel/softirq.c:447 [inline]
  [<ffffffff81568bca>] __irq_exit_rcu+0xca/0x230 kernel/softirq.c:700
  [<ffffffff81568ae9>] irq_exit_rcu+0x9/0x20 kernel/softirq.c:712
  [<ffffffff85b89f52>] sysvec_apic_timer_interrupt+0x42/0x90 arch/x86/kernel/apic/apic.c:1107
  [<ffffffff85c00ccb>] asm_sysvec_apic_timer_interrupt+0x1b/0x20 arch/x86/include/asm/idtentry.h:656

Fixes: d636fc5dd6 ("net: sched: add rcu annotations around qdisc->qdisc_sleeping")
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Jiri Pirko <jiri@nvidia.com>
Acked-by: Jamal Hadi Salim <jhs@mojatatu.com>
Link: https://lore.kernel.org/r/20240402134133.2352776-1-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:29:42 -07:00
Jakub Kicinski
1b39b79d26 Merge branch 'af_unix-remove-old-gc-leftovers'
Kuniyuki Iwashima says:

====================
af_unix: Remove old GC leftovers.

This is a follow-up series for commit 4090fa373f ("af_unix: Replace
garbage collection algorithm.") which introduced the new GC for AF_UNIX.

Now we no longer need two ugly tricks for the old GC, let's remove them.
====================

Link: https://lore.kernel.org/r/20240401173125.92184-1-kuniyu@amazon.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-04-03 19:27:15 -07:00