net: dsa: restore dsa_software_vlan_untag() ability to operate on VLAN-untagged traffic

Robert Hodaszi reports that locally terminated traffic towards
VLAN-unaware bridge ports is broken with ocelot-8021q. He is describing
the same symptoms as for commit 1f9fc48fd3 ("net: dsa: sja1105: fix
reception from VLAN-unaware bridges").

For context, the set merged as "VLAN fixes for Ocelot driver":
https://lore.kernel.org/netdev/20240815000707.2006121-1-vladimir.oltean@nxp.com/

was developed in a slightly different form earlier this year, in January.
Initially, the switch was unconditionally configured to set OCELOT_ES0_TAG
when using ocelot-8021q, regardless of port operating mode.

This led to the situation where VLAN-unaware bridge ports would always
push their PVID - see ocelot_vlan_unaware_pvid() - a negligible value
anyway - into RX packets. To strip this in software, we would have needed
DSA to know what private VID the switch chose for VLAN-unaware bridge
ports, and pushed into the packets. This was implemented downstream, and
a remnant of it remains in the form of a comment mentioning
ds->ops->get_private_vid(), as something which would maybe need to be
considered in the future.

However, for upstream, it was deemed inappropriate, because it would
mean introducing yet another behavior for stripping VLAN tags from
VLAN-unaware bridge ports, when one already existed (ds->untag_bridge_pvid).
The latter has been marked as obsolete along with an explanation why it
is logically broken, but still, it would have been confusing.

So, for upstream, felix_update_tag_8021q_rx_rule() was developed, which
essentially changed the state of affairs from "Felix with ocelot-8021q
delivers all packets as VLAN-tagged towards the CPU" into "Felix with
ocelot-8021q delivers all packets from VLAN-aware bridge ports towards
the CPU". This was done on the premise that in VLAN-unaware mode,
there's nothing useful in the VLAN tags, and we can avoid introducing
ds->ops->get_private_vid() in the DSA receive path if we configure the
switch to not push those VLAN tags into packets in the first place.

Unfortunately, and this is when the trainwreck started, the selftests
developed initially and posted with the series were not re-ran.
dsa_software_vlan_untag() was initially written given the assumption
that users of this feature would send _all_ traffic as VLAN-tagged.
It was only partially adapted to the new scheme, by removing
ds->ops->get_private_vid(), which also used to be necessary in
standalone ports mode.

Where the trainwreck became even worse is that I had a second opportunity
to think about this, when the dsa_software_vlan_untag() logic change
initially broke sja1105, in commit 1f9fc48fd3 ("net: dsa: sja1105: fix
reception from VLAN-unaware bridges"). I did not connect the dots that
it also breaks ocelot-8021q, for pretty much the same reason that not
all received packets will be VLAN-tagged.

To be compatible with the optimized Felix control path which runs
felix_update_tag_8021q_rx_rule() to only push VLAN tags when useful (in
VLAN-aware mode), we need to restore the old dsa_software_vlan_untag()
logic. The blamed commit introduced the assumption that
dsa_software_vlan_untag() will see only VLAN-tagged packets, assumption
which is false. What corrupts RX traffic is the fact that we call
skb_vlan_untag() on packets which are not VLAN-tagged in the first
place.

Fixes: 93e4649efa ("net: dsa: provide a software untagging function on RX for VLAN-aware bridges")
Reported-by: Robert Hodaszi <robert.hodaszi@digi.com>
Closes: https://lore.kernel.org/netdev/20241215163334.615427-1-robert.hodaszi@digi.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241216135059.1258266-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
This commit is contained in:
Vladimir Oltean 2024-12-16 15:50:59 +02:00 committed by Jakub Kicinski
parent a713c017ef
commit 16f027cd40

View File

@ -138,9 +138,10 @@ static inline void dsa_software_untag_vlan_unaware_bridge(struct sk_buff *skb,
* dsa_software_vlan_untag: Software VLAN untagging in DSA receive path
* @skb: Pointer to socket buffer (packet)
*
* Receive path method for switches which cannot avoid tagging all packets
* towards the CPU port. Called when ds->untag_bridge_pvid (legacy) or
* ds->untag_vlan_aware_bridge_pvid is set to true.
* Receive path method for switches which send some packets as VLAN-tagged
* towards the CPU port (generally from VLAN-aware bridge ports) even when the
* packet was not tagged on the wire. Called when ds->untag_bridge_pvid
* (legacy) or ds->untag_vlan_aware_bridge_pvid is set to true.
*
* As a side effect of this method, any VLAN tag from the skb head is moved
* to hwaccel.
@ -149,14 +150,19 @@ static inline struct sk_buff *dsa_software_vlan_untag(struct sk_buff *skb)
{
struct dsa_port *dp = dsa_user_to_port(skb->dev);
struct net_device *br = dsa_port_bridge_dev_get(dp);
u16 vid;
u16 vid, proto;
int err;
/* software untagging for standalone ports not yet necessary */
if (!br)
return skb;
err = br_vlan_get_proto(br, &proto);
if (err)
return skb;
/* Move VLAN tag from data to hwaccel */
if (!skb_vlan_tag_present(skb)) {
if (!skb_vlan_tag_present(skb) && skb->protocol == htons(proto)) {
skb = skb_vlan_untag(skb);
if (!skb)
return NULL;