The bridge driver gates the neighbor suppression code behind an internal
per-bridge flag called 'BROPT_NEIGH_SUPPRESS_ENABLED'. The flag is set
when at least one bridge port has neighbor suppression enabled.
As a preparation for per-{Port, VLAN} neighbor suppression, make sure
the global flag is also set if per-{Port, VLAN} neighbor suppression is
enabled. That is, when the 'BR_NEIGH_VLAN_SUPPRESS' flag is set on at
least one bridge port.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add two internal flags that will be used to enable / disable per-{Port,
VLAN} neighbor suppression:
1. 'BR_NEIGH_VLAN_SUPPRESS': A per-port flag used to indicate that
per-{Port, VLAN} neighbor suppression is enabled on the bridge port.
When set, 'BR_NEIGH_SUPPRESS' has no effect.
2. 'BR_VLFLAG_NEIGH_SUPPRESS_ENABLED': A per-VLAN flag used to indicate
that neighbor suppression is enabled on the given VLAN.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Subsequent patches are going to add per-{Port, VLAN} neighbor
suppression, which will require br_flood() to potentially suppress ARP /
NS packets on a per-{Port, VLAN} basis.
As a preparation, pass the VLAN ID of the packet as another argument to
br_flood().
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
The bridge does not flood ARP / NS packets for which a reply was sent to
bridge ports that have neighbor suppression enabled.
Subsequent patches are going to add per-{Port, VLAN} neighbor
suppression, which is going to make it more expensive to check whether
neighbor suppression is enabled since a VLAN lookup will be required.
Therefore, instead of unnecessarily performing this lookup for every
packet, only perform it for ARP / NS packets for which a reply was sent.
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Emeel Hakim says:
====================
Support MACsec VLAN
This patch series introduces support for hardware (HW) offload MACsec
devices with VLAN configuration. The patches address both scenarios
where the VLAN header is both the inner and outer header for MACsec.
The changes include:
1. Adding MACsec offload operation for VLAN.
2. Considering VLAN when accessing MACsec net device.
3. Currently offloading MACsec when it's configured over VLAN with
current MACsec TX steering rules would wrongly insert the MACsec sec tag
after inserting the VLAN header. This resulted in an ETHERNET | SECTAG |
VLAN packet when ETHERNET | VLAN | SECTAG is configured. The patche
handles this issue when configuring steering rules.
4. Adding MACsec rx_handler change support in case of a marked skb and a
mismatch on the dst MAC address.
Please review these changes and let me know if you have any feedback or
concerns.
Updates since v1:
- Consult vlan_features when adding NETIF_F_HW_MACSEC.
- Allow grep for the functions.
- Add helper function to get the macsec operation to allow the compiler
to make some choice.
Updates since v2:
- Don't use macros to allow direct navigattion from mdo functions to its
implementation.
- Make the vlan_get_macsec_ops argument a const.
- Check if the specific mdo function is available before calling it.
- Enable NETIF_F_HW_MACSEC by default when the lower device has it enabled
and in case the lower device currently has NETIF_F_HW_MACSEC but disabled
let the new vlan device also have it disabled.
Updates since v3:
- Split patch ("vlan: Add MACsec offload operations for VLAN interface")
to prevent mixing generic vlan code changes with driver changes.
- Add mdo_open, stop and stats to support drivers which have those.
- Don't fail if macsec offload operations are available but a specific
function is not, to support drivers which does not implement all
macsec offload operations.
- Don't call find_rx_sc twice in the same loop, instead save the result
in a parameter and re-use it.
- Completely remove _BUILD_VLAN_MACSEC_MDO macro, to prevent returning
from a macro.
- Reorder the functions inside struct macsec_ops to match the struct
decleration.
Updates since v4:
- Change subject line of ("macsec: Add MACsec rx_handler change support") and adapt commit message.
- Don't separate the new check in patch ("macsec: Add MACsec rx_handler change support")
from the previous if/else if.
- Drop"_found" from the parameter naming "rx_sc_found" and move the definition to
the relevant block.
- Remove "{}" since not needed around a single line.
Updates since v5:
- Consider promiscuous mode case.
Updates since v6:
- Use IS_ENABLED instead of checking for ifdef.
- Don't add inline keywork in c files, let the compiler make its own decisions.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
Offloading device drivers will mark offloaded MACsec SKBs with the
corresponding SCI in the skb_metadata_dst so the macsec rx handler will
know to which interface to divert those skbs, in case of a marked skb
and a mismatch on the dst MAC address, divert the skb to the macsec
net_device where the macsec rx_handler will be called to consider cases
where relying solely on the dst MAC address is insufficient.
One such instance is when using MACsec with a VLAN as an inner
header, where the packet structure is ETHERNET | SECTAG | VLAN.
In such a scenario, the dst MAC address in the ethernet header
will correspond to the VLAN MAC address, resulting in a mismatch.
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Offloading MACsec when its configured over VLAN with current MACsec
TX steering rules will wrongly insert MACsec sec tag after inserting
the VLAN header leading to a ETHERNET | SECTAG | VLAN packet when
ETHERNET | VLAN | SECTAG is configured.
The above issue is due to adding the SECTAG by HW which is a later
stage compared to the VLAN header insertion stage.
Detect such a case and adjust TX steering rules to insert the
SECTAG in the correct place by using reformat_param_0 field in
the packet reformat to indicate the offset of SECTAG from end of
the MAC header to account for VLANs in granularity of 4Bytes.
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
MACsec device may have a VLAN device on top of it.
Detect MACsec state correctly under this condition,
and return the correct net device accordingly.
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Enable MACsec offload feature over VLAN by adding NETIF_F_HW_MACSEC
to the device vlan_features.
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add support for MACsec offload operations for VLAN driver
to allow offloading MACsec when VLAN's real device supports
Macsec offload by forwarding the offload request to it.
Signed-off-by: Emeel Hakim <ehakim@nvidia.com>
Reviewed-by: Subbaraya Sundeep <sbhatta@marvell.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Xin Long says:
====================
sctp: fix a plenty of flexible-array-nested warnings
Paolo noticed a compile warning in SCTP,
../net/sctp/stream_sched_fc.c: note: in included file (through ../include/net/sctp/sctp.h):
../include/net/sctp/structs.h:335:41: warning: array of flexible structures
But not only this, there are actually quite a lot of such warnings in
some SCTP structs. This patchset fixes most of warnings by deleting
these nested flexible array members.
After this patchset, there are still some warnings left:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
./include/net/sctp/structs.h:1145:41: warning: nested flexible array
./include/uapi/linux/sctp.h:641:34: warning: nested flexible array
./include/uapi/linux/sctp.h:643:34: warning: nested flexible array
./include/uapi/linux/sctp.h:644:33: warning: nested flexible array
./include/uapi/linux/sctp.h:650:40: warning: nested flexible array
./include/uapi/linux/sctp.h:653:39: warning: nested flexible array
the 1st is caused by __data[] in struct ip_options, not in SCTP;
the others are in uapi, and we should not touch them.
Note that instead of completely deleting it, we just leave it as a
comment in the struct, signalling to the reader that we do expect
such variable parameters over there, as Marcelo suggested.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch deletes the flexible-array payload[] from the structure
sctp_datahdr to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/socket.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:230:29: warning: nested flexible array
This member is not even used anywhere.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch deletes the flexible-array hmac[] from the structure
sctp_authhdr to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/auth.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:735:29: warning: nested flexible array
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch deletes the flexible-array peer_init[] from the structure
sctp_cookie to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/sm_make_chunk.c: note: in included file (through include/net/sctp/sctp.h):
./include/net/sctp/structs.h:1588:28: warning: nested flexible array
./include/net/sctp/structs.h:343:28: warning: nested flexible array
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch deletes the flexible-array variable[] from the structure
sctp_sackhdr and sctp_errhdr to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/sm_statefuns.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:451:28: warning: nested flexible array
./include/linux/sctp.h:393:29: warning: nested flexible array
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch deletes the flexible-array skip[] from the structure
sctp_ifwdtsn/fwdtsn_hdr to avoid some sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/stream_interleave.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:611:32: warning: nested flexible array
./include/linux/sctp.h:628:33: warning: nested flexible array
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This patch deletes the flexible-array params[] from the structure
sctp_inithdr, sctp_addiphdr and sctp_reconf_chunk to avoid some
sparse warnings:
# make C=2 CF="-Wflexible-array-nested" M=./net/sctp/
net/sctp/input.c: note: in included file (through include/net/sctp/structs.h, include/net/sctp/sctp.h):
./include/linux/sctp.h:278:29: warning: nested flexible array
./include/linux/sctp.h:675:30: warning: nested flexible array
This warning is reported if a structure having a flexible array
member is included by other structures.
Signed-off-by: Xin Long <lucien.xin@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Johannes Berg says:
====================
net: extend drop reasons
Here's v4 of the extended drop reasons, with fixes to kernel-doc
and checkpatch.
====================
Link: https://lore.kernel.org/r/20230419125254.20789-1-johannes@sipsolutions.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
It can be really hard to analyse or debug why packets are
going missing in mac80211, so add the needed infrastructure
to use use the new per-subsystem drop reasons.
We actually use two drop reason subsystems here because of
the different handling of frames that are dropped but still
go to monitor for old versions of hostapd, and those that
are just completely unusable (e.g. crypto failed.)
Annotate a few reasons here just to illustrate this, we'll
need to go through and annotate more of them later.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Extend drop reasons to make them usable by subsystems
other than core by reserving the high 16 bits for a
new subsystem ID, of which 0 of course is used for the
existing reasons immediately.
To still be able to have string reasons, restructure
that code a bit to make the loopup under RCU, the only
user of this (right now) is drop_monitor.
Link: https://lore.kernel.org/netdev/00659771ed54353f92027702c5bbb84702da62ce.camel@sipsolutions.net
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This will, after the next patch, hold only the core
drop reasons and minimal infrastructure. Fix a small
kernel-doc issue while at it, to avoid the move
triggering a checker.
Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
ICMPv6 error packets are not sent to the anycast destinations and this
prevents things like traceroute from working. So create a setting similar
to ECHO when dealing with Anycast sources (icmpv6_echo_ignore_anycast).
Signed-off-by: Mahesh Bandewar <maheshb@google.com>
Reviewed-by: David Ahern <dsahern@kernel.org>
Reviewed-by: Maciej Żenczykowski <maze@google.com>
Link: https://lore.kernel.org/r/20230419013238.2691167-1-maheshb@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Vladimir Oltean says:
====================
ethtool mm API consolidation
This series consolidates the behavior of the 2 drivers that implement
the ethtool MAC Merge layer by making NXP ENETC commit its preemptible
traffic classes to hardware only when MM TX is active (same as Ocelot).
Then, after resolving an issue with the ENETC driver, it restricts user
space from entering 2 states which don't make sense:
- pmac-enabled off tx-enabled on verify-enabled *
- pmac-enabled * tx-enabled off verify-enabled on
Then, it introduces a selftest (ethtool_mm.sh) which puts everything
together and tests all valid configurations known to me.
This is simultaneously the v2 of "[PATCH net-next 0/2] ethtool mm API
improvements":
https://lore.kernel.org/netdev/20230415173454.3970647-1-vladimir.oltean@nxp.com/
which had caused some problems to openlldp. Those were solved in the
meantime, see:
11171b474f
and of "[RFC PATCH net-next] selftests: forwarding: add a test for MAC
Merge layer":
https://lore.kernel.org/netdev/20230210221243.228932-1-vladimir.oltean@nxp.com/
====================
Link: https://lore.kernel.org/r/20230418111459.811553-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The MAC Merge layer (IEEE 802.3-2018 clause 99) does all the heavy
lifting for Frame Preemption (IEEE 802.1Q-2018 clause 6.7.2), a TSN
feature for minimizing latency.
Preemptible traffic is different on the wire from normal traffic in
incompatible ways. If we send a preemptible packet and the link partner
doesn't support preemption, it will drop it as an error frame and we
will never know. The MAC Merge layer has a control plane of its own,
which can be manipulated (using ethtool) in order to negotiate this
capability with the link partner (through LLDP).
Actually the TLV format for LLDP solves this problem only partly,
because both partners only advertise:
- if they support preemption (RX and TX)
- if they have enabled preemption (TX)
so we cannot tell the link partner what to do - we cannot force it to
enable reception of our preemptible packets.
That is fully solved by the verification feature, where the local device
generates some small probe frames which look like preemptible frames
with no useful content, and the link partner is obliged to respond to
them if it supports the standard. If the verification times out, we know
that preemption isn't active in our TX direction on the link.
Having clarified the definition, this selftest exercises the manual
(ethtool) configuration path of 2 link partners (with and without
verification), and the LLDP code path, using the openlldp project.
The test also verifies the TX activity of the MAC Merge layer by
sending traffic through a traffic class configured as preemptible
(using mqprio). There isn't a good way to make this really portable
(user space cannot find out how many traffic classes there are for
a device), but I chose num_tc 4 here, that should work reasonably well.
I also know that some devices (stmmac) only permit TXQ0 to be
preemptible, so this is why PREEMPTIBLE_PRIO was strategically chosen
as 0. Even if other hardware is more configurable, this test should
cover the baseline.
This is not really a "forwarding" selftest, but I put it near the other
"ethtool" selftests.
$ ./ethtool_mm.sh eno0 swp0
TEST: Manual configuration with verification: eno0 to swp0 [ OK ]
TEST: Manual configuration with verification: swp0 to eno0 [ OK ]
TEST: Manual configuration without verification: eno0 to swp0 [ OK ]
TEST: Manual configuration without verification: swp0 to eno0 [ OK ]
TEST: Manual configuration with failed verification: eno0 to swp0 [ OK ]
TEST: Manual configuration with failed verification: swp0 to eno0 [ OK ]
TEST: LLDP [ OK ]
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Counters for the MAC Merge layer and preemptible MAC have standardized
so far on using structured ethtool stats as opposed to the driver
specific names and meanings.
Benefit from that rare opportunity and introduce a helper to lib.sh for
querying standardized counters, in the hope that these will take off for
other uses as well.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
mlxsw selftests often invoke a bail_on_lldpad() helper to make sure LLDPAD
is not running, to prevent conflicts between the QoS configuration applied
through TC or DCB command line tool, and the DCB configuration that LLDPAD
might apply. This helper might be useful to others. Move the function to
lib.sh, and parameterize to make reusable in other contexts.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The driver-specific wrappers of these selftests invoke bail_on_lldpad to
make sure that LLDPAD doesn't trample the configuration. The function
bail_on_lldpad is going to move to lib.sh in the next patch. With that, it
won't be visible for the wrappers before sourcing the framework script. And
after sourcing it, it is too late: the selftest will have run by then.
One option might be to source NUM_NETIFS=0 lib.sh from the wrapper, but
even if that worked (it might, it might not), that seems cumbersome. lib.sh
is doing fair amount of stuff, and even if it works today, it does not look
particularly solid as a solution.
Instead, introduce a hook, sch_tbf_pre_hook(), that when available, gets
invoked. Move the bail to the hook.
Signed-off-by: Petr Machata <petrm@nvidia.com>
Reviewed-by: Danielle Ratson <danieller@nvidia.com>
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The verify-enabled boolean (ETHTOOL_A_MM_VERIFY_ENABLED) was intended to
be a sub-setting of tx-enabled (ETHTOOL_A_MM_TX_ENABLED). IOW, MAC Merge
TX can be enabled with or without verification, but verification with TX
disabled makes no sense.
The pmac-enabled boolean (ETHTOOL_A_MM_PMAC_ENABLED) was intended to be
a global toggle from an API perspective, whereas tx-enabled just handles
the TX direction. IOW, the pMAC can be enabled with or without TX, but
it doesn't make sense to enable TX if the pMAC is not enabled.
Add two checks which sanitize and reject these invalid cases.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
These have been useful in debugging various problems related to frame
preemption, so make them available through ethtool --register-dump for
later too.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This was left as TODO in commit 01e23b2b3bad ("net: enetc: add support
for preemptible traffic classes") since it's relatively complicated.
Where this makes a difference is with a configuration as follows:
ethtool --set-mm eno0 pmac-enabled on tx-enabled on verify-enabled on
Preemptible packets should only be sent when the MAC Merge TX direction
becomes active (i.o.w. when the verification process succeeds, aka when
the link partner confirms it can process preemptible traffic). But the
tc qdisc with the preemptible traffic classes is offloaded completely
asynchronously w.r.t. the MM becoming active.
The ENETC manual does suggest that this should be handled in the driver:
"On startup, software should wait for the verification process to
complete (MMCSR[VSTS]=011) before initiating traffic".
Adding the necessary logic allows future selftests to uphold the claim
that an inactive or disabled MAC Merge layer should never send data
packets through the pMAC.
This change moves enetc_set_ptcfpr() from enetc.c to enetc_ethtool.c,
where its only caller is now - enetc_mm_commit_preemptible_tcs().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The MMCSR register contains 2 fields with overlapping meaning:
- LPA (Local preemption active):
This read-only status bit indicates whether preemption is active for
this port. This bit will be set if preemption is both enabled and has
completed the verification process.
- TXSTS (Merge status):
This read-only status field provides the state of the MAC Merge sublayer
transmit status as defined in IEEE Std 802.3-2018 Clause 99.
00 Transmit preemption is inactive
01 Transmit preemption is active
10 Reserved
11 Reserved
However none of these 2 fields offer reliable reporting to software.
When connecting ENETC to a link partner which is not capable of Frame
Preemption, the expectation is that ENETC's verification should fail
(VSTS=4) and its MM TX direction should be inactive (LPA=0, TXSTS=00)
even though the MM TX is enabled (ME=1). But surprise, the LPA bit of
MMCSR stays set even if VSTS=4 and ME=1.
OTOH, the TXSTS field has the opposite problem. I cannot get its value
to change from 0, even when connecting to a link partner capable of
frame preemption, which does respond to its verification frames (ME=1
and VSTS=3, "SUCCEEDED").
The only option with such buggy hardware seems to be to reimplement the
formula for calculating tx-active in software, which is for tx-enabled
to be true, and for the verify-status to be either SUCCEEDED, or
DISABLED.
Without reliable tx-active reporting, we have no good indication when
to commit the preemptible traffic classes to hardware, which makes it
possible (but not desirable) to send preemptible traffic to a link
partner incapable of receiving it. However, currently we do not have the
logic to wait for TX to be active yet, so the impact is limited.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Current enetc_set_mm() is designed to set the priv->active_offloads bit
ENETC_F_QBU for enetc_mm_link_state_update() to act on, but if the link
is already up, it modifies the ENETC_MMCSR_ME ("Merge Enable") bit
directly.
The problem is that it only *sets* ENETC_MMCSR_ME if the link is up, it
doesn't *clear* it if needed. So subsequent enetc_get_mm() calls still
see tx-enabled as true, up until a link down event, which is when
enetc_mm_link_state_update() will get called.
This is not a functional issue as far as I can assess. It has only come
up because I'd like to uphold a simple API rule in core ethtool code:
the pMAC cannot be disabled if TX is going to be enabled. Currently,
the fact that TX remains enabled for longer than expected (after the
enetc_set_mm() call that disables it) is going to violate that rule,
which is how it was caught.
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Reviewed-by: Simon Horman <simon.horman@corigine.com>
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Refer to USB serial device or net device, there is a notice to
let end user know the status of device, like attached or
disconnected. Add attach/disconnect print for wwan device as
well.
Signed-off-by: Slark Xiao <slark_xiao@163.com>
Reviewed-by: Loic Poulain <loic.poulain@linaro.org>
Link: https://lore.kernel.org/r/20230420023617.3919569-1-slark_xiao@163.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
__kfree_skb_defer() uses the old naming where "defer" meant
slab bulk free/alloc APIs. In the meantime we also made
__kfree_skb_defer() feed the per-NAPI skb cache, which
implies bulk APIs. So take away the 'defer' and add 'napi'.
While at it add a drop reason. This only matters on the
tx_action path, if the skb has a frag_list. But getting
rid of a SKB_DROP_REASON_NOT_SPECIFIED seems like a net
benefit so why not.
Reviewed-by: Alexander Lobakin <aleksander.lobakin@intel.com>
Link: https://lore.kernel.org/r/20230420020005.815854-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Address a number of warnings flagged by
./scripts/kernel-doc -none include/net/flow_dissector.h
include/net/flow_dissector.h:23: warning: Function parameter or member 'addr_type' not described in 'flow_dissector_key_control'
include/net/flow_dissector.h:23: warning: Function parameter or member 'flags' not described in 'flow_dissector_key_control'
include/net/flow_dissector.h:46: warning: Function parameter or member 'padding' not described in 'flow_dissector_key_basic'
include/net/flow_dissector.h:145: warning: Function parameter or member 'tipckey' not described in 'flow_dissector_key_addrs'
include/net/flow_dissector.h:157: warning: cannot understand function prototype: 'struct flow_dissector_key_arp '
include/net/flow_dissector.h:171: warning: cannot understand function prototype: 'struct flow_dissector_key_ports '
include/net/flow_dissector.h:203: warning: cannot understand function prototype: 'struct flow_dissector_key_icmp '
Also improve indentation on adjacent lines to those changed
to address the above.
No functional changes intended.
Signed-off-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20230419-flow-dissector-kdoc-v1-1-1aa0cca1118b@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Jesper points out that we must prevent recycling into cache
after page_pool_destroy() is called, because page_pool_destroy()
is not synchronized with recycling (some pages may still be
outstanding when destroy() gets called).
I assumed this will not happen because NAPI can't be scheduled
if its page pool is being destroyed. But I missed the fact that
NAPI may get reused. For instance when user changes ring configuration
driver may allocate a new page pool, stop NAPI, swap, start NAPI,
and then destroy the old pool. The NAPI is running so old page
pool will think it can recycle to the cache, but the consumer
at that point is the destroy() path, not NAPI.
To avoid extra synchronization let the drivers do "unlinking"
during the "swap" stage while NAPI is indeed disabled.
Fixes: 8c48eea3adf3 ("page_pool: allow caching from safely localized NAPI")
Reported-by: Jesper Dangaard Brouer <jbrouer@redhat.com>
Link: https://lore.kernel.org/all/e8df2654-6a5b-3c92-489d-2fe5e444135f@redhat.com/
Acked-by: Jesper Dangaard Brouer <brouer@redhat.com>
Link: https://lore.kernel.org/r/20230419182006.719923-1-kuba@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
The CONFIG_PHYLIB symbol is selected by a number of device drivers that
need PHY support, but it now has a dependency on CONFIG_LEDS_CLASS,
which may not be enabled, causing build failures.
Avoid the risk of missing and circular dependencies by guarding the
phylib LED support itself in another Kconfig symbol that can only be
enabled if the dependency is met.
This could be made a hidden symbol and always enabled when both CONFIG_OF
and CONFIG_LEDS_CLASS are reachable from the phylib, but there may be an
advantage in having users see this option when they have a misconfigured
kernel without built-in LED support.
Fixes: 01e5b728e9e4 ("net: phy: Add a binding for PHY LEDs")
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Andrew Lunn <andrew@lunn.ch>
Link: https://lore.kernel.org/r/20230420084624.3005701-1-arnd@kernel.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
To be consistent with the other enum keys use OP_MOD
instead of OP_MODE.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Remove unused function which also seems a duplicate
of esw_port_metadata_set().
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
The passded esw->dev is redundant as esw being passed and esw->dev being
used inside.
Signed-off-by: Roi Dayan <roid@nvidia.com>
Reviewed-by: Maor Dickman <maord@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Linking the NAPI to the rq page_pool to improve page_pool cache
usage during skb recycling.
Here are the observed improvements for a iperf single stream
test case:
- For 1500 MTU and legacy rq, seeing a 20% improvement of cache usage.
- For 9K MTU, seeing 33-40 % page_pool cache usage improvements for
both striding and legacy rq (depending if the application is running on
the same core as the rq or not).
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
When the XDP handler marks the data for transmission (XDP_TX),
it is incorrect to release the page fragment. Instead, the
fragments should be marked as MLX5E_WQE_FRAG_SKIP_RELEASE
because XDP will release the page directly to the page_pool
(page_pool_put_defragged_page) after TX completion.
The linear case already does this. This patch fixes the
nonlinear part as well.
Also, the looping over the fragments was incorrect: When handling
pages after XDP_TX in the legacy rq nonlinear case, the loop was
skipping the first wqe fragment.
Fixes: 3f93f82988bc ("net/mlx5e: RX, Defer page release in legacy rq for better recycling")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
mlx5e_free_rx_descs is responsible for calling the dealloc_wqe op which
returns pages to the page_pool. This can happen during flush or close.
For XSK, the regular RQ is flushed (when replaced by the XSK RQ) and
also closed later. This is normally not a problem as the wqe list is
empty on a second call to mlx5e_free_rx_descs. However, for striding RQ,
the previously released wqes from the list will appear as missing
and will be released a second time by mlx5e_free_rx_missing_descs.
This patch sets the no release bits on the striding RQ wqes in the
dealloc_wqe op to prevent releasing the pages a second time.
Please note that the bits are set only in the control path during
close and not in the data path.
Fixes: 4c2a13236807 ("net/mlx5e: RX, Defer page release in striding rq for better recycling")
Signed-off-by: Dragos Tatulea <dtatulea@nvidia.com>
Reviewed-by: Tariq Toukan <tariqt@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Create a new devlink health reporter for representor interface, which
reports the values of representor vnic diagnostic counters when diagnosed.
This patch will allow admins to monitor VF diagnostic counters through
the representor-interface vnic reporter.
Example of usage:
$ devlink health diagnose pci/0000:08:00.0/65537 reporter vnic
vNIC env counters:
total_error_queues: 0 send_queue_priority_update_flow: 0
comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
invalid_command: 0 quota_exceeded_command: 0
nic_receive_steering_discard: 0
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Create a vnic devlink health reporter for PFs/VFs interfaces.
The reporter's diagnose callback displays the values of vNIC/vport
transport debug counters of PFs/VFs, as follows:
$ devlink health diagnose pci/0000:08:00.0 reporter vnic
vNIC env counters:
total_error_queues: 0 send_queue_priority_update_flow: 0
comp_eq_overrun: 0 async_eq_overrun: 0 cq_overrun: 0
invalid_command: 0 quota_exceeded_command: 0
nic_receive_steering_discard: 0
Moreover, add documentation on the reporter functionality and the
counters description.
While at it, expose the vNIC counters diagnose function to be used by
the downstream patch, which will reveal the counters for representor
interfaces.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Reviewed-by: Moshe Shemesh <moshe@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
This reverts commit 606e6a72e29dff9e3341c4cc9b554420e4793f401 which exposes
the vnic diagnostic counters via debugfs. Instead, The upcoming series will
expose the same counters through devlink health reporter.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
This reverts commit 4fe1b3a5f8fe2fdcedcaba9561e5b0ae5cb1d15b, which
exposes the steering dropped packets counter via debugfs. The upcoming
series will expose the counter via devlink health reporter instead
of debugfs.
Signed-off-by: Maher Sanalla <msanalla@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>
Add counters for number of buddies that are currently in use per domain
per buddy type (STE, MODIFY-HEADER, MODIFY-PATTERN).
Signed-off-by: Erez Shitrit <erezsh@nvidia.com>
Signed-off-by: Yevgeny Kliteynik <kliteyn@nvidia.com>
Reviewed-by: Alex Vesker <valex@nvidia.com>
Signed-off-by: Saeed Mahameed <saeedm@nvidia.com>