Networking fixes for 5.16-rc3, including fixes from netfilter.

Current release - regressions:
 
  - r8169: fix incorrect mac address assignment
 
  - vlan: fix underflow for the real_dev refcnt when vlan creation fails
 
  - smc: avoid warning of possible recursive locking
 
 Current release - new code bugs:
 
  - vsock/virtio: suppress used length validation
 
  - neigh: fix crash in v6 module initialization error path
 
 Previous releases - regressions:
 
  - af_unix: fix change in behavior in read after shutdown
 
  - igb: fix netpoll exit with traffic, avoid warning
 
  - tls: fix splice_read() when starting mid-record
 
  - lan743x: fix deadlock in lan743x_phy_link_status_change()
 
  - marvell: prestera: fix bridge port operation
 
 Previous releases - always broken:
 
  - tcp_cubic: fix spurious Hystart ACK train detections for
    not-cwnd-limited flows
 
  - nexthop: fix refcount issues when replacing IPv6 groups
 
  - nexthop: fix null pointer dereference when IPv6 is not enabled
 
  - phylink: force link down and retrigger resolve on interface change
 
  - mptcp: fix delack timer length calculation and incorrect early
    clearing
 
  - ieee802154: handle iftypes as u32, prevent shift-out-of-bounds
 
  - nfc: virtual_ncidev: change default device permissions
 
  - netfilter: ctnetlink: fix error codes and flags used for kernel side
    filtering of dumps
 
  - netfilter: flowtable: fix IPv6 tunnel addr match
 
  - ncsi: align payload to 32-bit to fix dropped packets
 
  - iavf: fix deadlock and loss of config during VF interface reset
 
  - ice: avoid bpf_prog refcount underflow
 
  - ocelot: fix broken PTP over IP and PTP API violations
 
 Misc:
 
  - marvell: mvpp2: increase MTU limit when XDP enabled
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmGhSCwACgkQMUZtbf5S
 IrvAdw//aR54mgc9rc0mkvS5sbDeKzDscmTzav5ANGjl+2ooKTOe8Qd07s59z6TJ
 H9IJlTu0Uc9Psbb2RvRo1T1HDohSpWy7SEN/Qlo6N+z1WzDHWbuXyC/KTQDM+8I1
 coMYBBTwBGkblBosuoMUi60GWLbBslLv9gR7HUZj7gbtxMfk36BrX5UYz1ONy+tx
 HiVshtOmzOgumBi+/j0tkI4lpI/ajf9eYaG6Vvd0A6F3idcbhWKNKfLPgw9qQF36
 sQrbz1SYwL5Ucgk47EG+Lpk7oSzbkdNoO6Ro9ncsebB8OMoLUhddclmG/fbgPG0o
 SWJ4kK3kmaRSTvSi6q4e5BM89oIhtFWhGRB6vURokrAQU1Ds+sq5F+8IwCaMqEYb
 GNyEZ8cdJhLc50RU+/Im3lN6IrRHvQiirE1BN+ZuCMjeSTrsqX18ZYMh1pSJhxkZ
 wRC03sSd2ZcaooFrSNJ5Scr3ndacrWNtVr78IQYCNrTjqJn1QUK7ZegTjP04FUfD
 JLB7+en8Hd6EKosJLKyoAPRwoFPZN6mDAPC6RfF45B3OoZAHbvXmJrOT6PatcqHe
 i0YwDkAJKPRijfcepN1IQYlY2Za5HwNWzCV6v0bf4tUCluDsSkczTKS02dZ1hegR
 oYW1Ra1BIyYK4cbG4H0lD7iBQGLGgwt38U1NlFpawbJa/fECUSs=
 =LtZ7
 -----END PGP SIGNATURE-----

Merge tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Jakub Kicinski:
 "Networking fixes, including fixes from netfilter.

  Current release - regressions:

   - r8169: fix incorrect mac address assignment

   - vlan: fix underflow for the real_dev refcnt when vlan creation
     fails

   - smc: avoid warning of possible recursive locking

  Current release - new code bugs:

   - vsock/virtio: suppress used length validation

   - neigh: fix crash in v6 module initialization error path

  Previous releases - regressions:

   - af_unix: fix change in behavior in read after shutdown

   - igb: fix netpoll exit with traffic, avoid warning

   - tls: fix splice_read() when starting mid-record

   - lan743x: fix deadlock in lan743x_phy_link_status_change()

   - marvell: prestera: fix bridge port operation

  Previous releases - always broken:

   - tcp_cubic: fix spurious Hystart ACK train detections for
     not-cwnd-limited flows

   - nexthop: fix refcount issues when replacing IPv6 groups

   - nexthop: fix null pointer dereference when IPv6 is not enabled

   - phylink: force link down and retrigger resolve on interface change

   - mptcp: fix delack timer length calculation and incorrect early
     clearing

   - ieee802154: handle iftypes as u32, prevent shift-out-of-bounds

   - nfc: virtual_ncidev: change default device permissions

   - netfilter: ctnetlink: fix error codes and flags used for kernel
     side filtering of dumps

   - netfilter: flowtable: fix IPv6 tunnel addr match

   - ncsi: align payload to 32-bit to fix dropped packets

   - iavf: fix deadlock and loss of config during VF interface reset

   - ice: avoid bpf_prog refcount underflow

   - ocelot: fix broken PTP over IP and PTP API violations

  Misc:

   - marvell: mvpp2: increase MTU limit when XDP enabled"

* tag 'net-5.16-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (94 commits)
  net: dsa: microchip: implement multi-bridge support
  net: mscc: ocelot: correctly report the timestamping RX filters in ethtool
  net: mscc: ocelot: set up traps for PTP packets
  net: ptp: add a definition for the UDP port for IEEE 1588 general messages
  net: mscc: ocelot: create a function that replaces an existing VCAP filter
  net: mscc: ocelot: don't downgrade timestamping RX filters in SIOCSHWTSTAMP
  net: hns3: fix incorrect components info of ethtool --reset command
  net: hns3: fix one incorrect value of page pool info when queried by debugfs
  net: hns3: add check NULL address for page pool
  net: hns3: fix VF RSS failed problem after PF enable multi-TCs
  net: qed: fix the array may be out of bound
  net/smc: Don't call clcsock shutdown twice when smc shutdown
  net: vlan: fix underflow for the real_dev refcnt
  ptp: fix filter names in the documentation
  ethtool: ioctl: fix potential NULL deref in ethtool_set_coalesce()
  nfc: virtual_ncidev: change default device permissions
  net/sched: sch_ets: don't peek at classes beyond 'nbands'
  net: stmmac: Disable Tx queues when reconfiguring the interface
  selftests: tls: test for correct proto_ops
  tls: fix replacing proto_ops
  ...
This commit is contained in:
Linus Torvalds 2021-11-26 12:58:53 -08:00
commit c5c17547b7
89 changed files with 1965 additions and 668 deletions

View File

@ -37,8 +37,7 @@ conn_reuse_mode - INTEGER
0: disable any special handling on port reuse. The new
connection will be delivered to the same real server that was
servicing the previous connection. This will effectively
disable expire_nodest_conn.
servicing the previous connection.
bit 1: enable rescheduling of new connections when it is safe.
That is, whenever expire_nodest_conn and for TCP sockets, when

View File

@ -486,8 +486,8 @@ of packets.
Drivers are free to use a more permissive configuration than the requested
configuration. It is expected that drivers should only implement directly the
most generic mode that can be supported. For example if the hardware can
support HWTSTAMP_FILTER_V2_EVENT, then it should generally always upscale
HWTSTAMP_FILTER_V2_L2_SYNC_MESSAGE, and so forth, as HWTSTAMP_FILTER_V2_EVENT
support HWTSTAMP_FILTER_PTP_V2_EVENT, then it should generally always upscale
HWTSTAMP_FILTER_PTP_V2_L2_SYNC, and so forth, as HWTSTAMP_FILTER_PTP_V2_EVENT
is more generic (and more useful to applications).
A driver which supports hardware time stamping shall update the struct

View File

@ -3580,13 +3580,14 @@ L: netdev@vger.kernel.org
S: Supported
F: drivers/net/ethernet/broadcom/b44.*
BROADCOM B53 ETHERNET SWITCH DRIVER
BROADCOM B53/SF2 ETHERNET SWITCH DRIVER
M: Florian Fainelli <f.fainelli@gmail.com>
L: netdev@vger.kernel.org
L: openwrt-devel@lists.openwrt.org (subscribers-only)
S: Supported
F: Documentation/devicetree/bindings/net/dsa/brcm,b53.yaml
F: drivers/net/dsa/b53/*
F: drivers/net/dsa/bcm_sf2*
F: include/linux/dsa/brcm.h
F: include/linux/platform_data/b53.h
@ -18493,6 +18494,7 @@ F: include/uapi/linux/pkt_sched.h
F: include/uapi/linux/tc_act/
F: include/uapi/linux/tc_ematch/
F: net/sched/
F: tools/testing/selftests/tc-testing
TC90522 MEDIA DRIVER
M: Akihiro Tsukada <tskd08@gmail.com>

View File

@ -1002,57 +1002,32 @@ static void ksz8_cfg_port_member(struct ksz_device *dev, int port, u8 member)
data &= ~PORT_VLAN_MEMBERSHIP;
data |= (member & dev->port_mask);
ksz_pwrite8(dev, port, P_MIRROR_CTRL, data);
dev->ports[port].member = member;
}
static void ksz8_port_stp_state_set(struct dsa_switch *ds, int port, u8 state)
{
struct ksz_device *dev = ds->priv;
int forward = dev->member;
struct ksz_port *p;
int member = -1;
u8 data;
p = &dev->ports[port];
ksz_pread8(dev, port, P_STP_CTRL, &data);
data &= ~(PORT_TX_ENABLE | PORT_RX_ENABLE | PORT_LEARN_DISABLE);
switch (state) {
case BR_STATE_DISABLED:
data |= PORT_LEARN_DISABLE;
if (port < dev->phy_port_cnt)
member = 0;
break;
case BR_STATE_LISTENING:
data |= (PORT_RX_ENABLE | PORT_LEARN_DISABLE);
if (port < dev->phy_port_cnt &&
p->stp_state == BR_STATE_DISABLED)
member = dev->host_mask | p->vid_member;
break;
case BR_STATE_LEARNING:
data |= PORT_RX_ENABLE;
break;
case BR_STATE_FORWARDING:
data |= (PORT_TX_ENABLE | PORT_RX_ENABLE);
/* This function is also used internally. */
if (port == dev->cpu_port)
break;
/* Port is a member of a bridge. */
if (dev->br_member & BIT(port)) {
dev->member |= BIT(port);
member = dev->member;
} else {
member = dev->host_mask | p->vid_member;
}
break;
case BR_STATE_BLOCKING:
data |= PORT_LEARN_DISABLE;
if (port < dev->phy_port_cnt &&
p->stp_state == BR_STATE_DISABLED)
member = dev->host_mask | p->vid_member;
break;
default:
dev_err(ds->dev, "invalid STP state: %d\n", state);
@ -1060,22 +1035,11 @@ static void ksz8_port_stp_state_set(struct dsa_switch *ds, int port, u8 state)
}
ksz_pwrite8(dev, port, P_STP_CTRL, data);
p = &dev->ports[port];
p->stp_state = state;
/* Port membership may share register with STP state. */
if (member >= 0 && member != p->member)
ksz8_cfg_port_member(dev, port, (u8)member);
/* Check if forwarding needs to be updated. */
if (state != BR_STATE_FORWARDING) {
if (dev->br_member & BIT(port))
dev->member &= ~BIT(port);
}
/* When topology has changed the function ksz_update_port_member
* should be called to modify port forwarding behavior.
*/
if (forward != dev->member)
ksz_update_port_member(dev, port);
ksz_update_port_member(dev, port);
}
static void ksz8_flush_dyn_mac_table(struct ksz_device *dev, int port)
@ -1341,7 +1305,7 @@ static void ksz8795_cpu_interface_select(struct ksz_device *dev, int port)
static void ksz8_port_setup(struct ksz_device *dev, int port, bool cpu_port)
{
struct ksz_port *p = &dev->ports[port];
struct dsa_switch *ds = dev->ds;
struct ksz8 *ksz8 = dev->priv;
const u32 *masks;
u8 member;
@ -1368,10 +1332,11 @@ static void ksz8_port_setup(struct ksz_device *dev, int port, bool cpu_port)
if (!ksz_is_ksz88x3(dev))
ksz8795_cpu_interface_select(dev, port);
member = dev->port_mask;
member = dsa_user_ports(ds);
} else {
member = dev->host_mask | p->vid_member;
member = BIT(dsa_upstream_port(ds, port));
}
ksz8_cfg_port_member(dev, port, member);
}
@ -1392,20 +1357,13 @@ static void ksz8_config_cpu_port(struct dsa_switch *ds)
ksz_cfg(dev, regs[S_TAIL_TAG_CTRL], masks[SW_TAIL_TAG_ENABLE], true);
p = &dev->ports[dev->cpu_port];
p->vid_member = dev->port_mask;
p->on = 1;
ksz8_port_setup(dev, dev->cpu_port, true);
dev->member = dev->host_mask;
for (i = 0; i < dev->phy_port_cnt; i++) {
p = &dev->ports[i];
/* Initialize to non-zero so that ksz_cfg_port_member() will
* be called.
*/
p->vid_member = BIT(i);
p->member = dev->port_mask;
ksz8_port_stp_state_set(ds, i, BR_STATE_DISABLED);
/* Last port may be disabled. */

View File

@ -391,7 +391,6 @@ static void ksz9477_cfg_port_member(struct ksz_device *dev, int port,
u8 member)
{
ksz_pwrite32(dev, port, REG_PORT_VLAN_MEMBERSHIP__4, member);
dev->ports[port].member = member;
}
static void ksz9477_port_stp_state_set(struct dsa_switch *ds, int port,
@ -400,8 +399,6 @@ static void ksz9477_port_stp_state_set(struct dsa_switch *ds, int port,
struct ksz_device *dev = ds->priv;
struct ksz_port *p = &dev->ports[port];
u8 data;
int member = -1;
int forward = dev->member;
ksz_pread8(dev, port, P_STP_CTRL, &data);
data &= ~(PORT_TX_ENABLE | PORT_RX_ENABLE | PORT_LEARN_DISABLE);
@ -409,40 +406,18 @@ static void ksz9477_port_stp_state_set(struct dsa_switch *ds, int port,
switch (state) {
case BR_STATE_DISABLED:
data |= PORT_LEARN_DISABLE;
if (port != dev->cpu_port)
member = 0;
break;
case BR_STATE_LISTENING:
data |= (PORT_RX_ENABLE | PORT_LEARN_DISABLE);
if (port != dev->cpu_port &&
p->stp_state == BR_STATE_DISABLED)
member = dev->host_mask | p->vid_member;
break;
case BR_STATE_LEARNING:
data |= PORT_RX_ENABLE;
break;
case BR_STATE_FORWARDING:
data |= (PORT_TX_ENABLE | PORT_RX_ENABLE);
/* This function is also used internally. */
if (port == dev->cpu_port)
break;
member = dev->host_mask | p->vid_member;
mutex_lock(&dev->dev_mutex);
/* Port is a member of a bridge. */
if (dev->br_member & (1 << port)) {
dev->member |= (1 << port);
member = dev->member;
}
mutex_unlock(&dev->dev_mutex);
break;
case BR_STATE_BLOCKING:
data |= PORT_LEARN_DISABLE;
if (port != dev->cpu_port &&
p->stp_state == BR_STATE_DISABLED)
member = dev->host_mask | p->vid_member;
break;
default:
dev_err(ds->dev, "invalid STP state: %d\n", state);
@ -451,23 +426,8 @@ static void ksz9477_port_stp_state_set(struct dsa_switch *ds, int port,
ksz_pwrite8(dev, port, P_STP_CTRL, data);
p->stp_state = state;
mutex_lock(&dev->dev_mutex);
/* Port membership may share register with STP state. */
if (member >= 0 && member != p->member)
ksz9477_cfg_port_member(dev, port, (u8)member);
/* Check if forwarding needs to be updated. */
if (state != BR_STATE_FORWARDING) {
if (dev->br_member & (1 << port))
dev->member &= ~(1 << port);
}
/* When topology has changed the function ksz_update_port_member
* should be called to modify port forwarding behavior.
*/
if (forward != dev->member)
ksz_update_port_member(dev, port);
mutex_unlock(&dev->dev_mutex);
ksz_update_port_member(dev, port);
}
static void ksz9477_flush_dyn_mac_table(struct ksz_device *dev, int port)
@ -1168,10 +1128,10 @@ static void ksz9477_phy_errata_setup(struct ksz_device *dev, int port)
static void ksz9477_port_setup(struct ksz_device *dev, int port, bool cpu_port)
{
u8 data8;
u8 member;
u16 data16;
struct ksz_port *p = &dev->ports[port];
struct dsa_switch *ds = dev->ds;
u8 data8, member;
u16 data16;
/* enable tag tail for host port */
if (cpu_port)
@ -1250,12 +1210,12 @@ static void ksz9477_port_setup(struct ksz_device *dev, int port, bool cpu_port)
ksz_pwrite8(dev, port, REG_PORT_XMII_CTRL_1, data8);
p->phydev.duplex = 1;
}
mutex_lock(&dev->dev_mutex);
if (cpu_port)
member = dev->port_mask;
member = dsa_user_ports(ds);
else
member = dev->host_mask | p->vid_member;
mutex_unlock(&dev->dev_mutex);
member = BIT(dsa_upstream_port(ds, port));
ksz9477_cfg_port_member(dev, port, member);
/* clear pending interrupts */
@ -1276,8 +1236,6 @@ static void ksz9477_config_cpu_port(struct dsa_switch *ds)
const char *prev_mode;
dev->cpu_port = i;
dev->host_mask = (1 << dev->cpu_port);
dev->port_mask |= dev->host_mask;
p = &dev->ports[i];
/* Read from XMII register to determine host port
@ -1312,23 +1270,15 @@ static void ksz9477_config_cpu_port(struct dsa_switch *ds)
/* enable cpu port */
ksz9477_port_setup(dev, i, true);
p->vid_member = dev->port_mask;
p->on = 1;
}
}
dev->member = dev->host_mask;
for (i = 0; i < dev->port_cnt; i++) {
if (i == dev->cpu_port)
continue;
p = &dev->ports[i];
/* Initialize to non-zero so that ksz_cfg_port_member() will
* be called.
*/
p->vid_member = (1 << i);
p->member = dev->port_mask;
ksz9477_port_stp_state_set(ds, i, BR_STATE_DISABLED);
p->on = 1;
if (i < dev->phy_port_cnt)

View File

@ -22,21 +22,40 @@
void ksz_update_port_member(struct ksz_device *dev, int port)
{
struct ksz_port *p;
struct ksz_port *p = &dev->ports[port];
struct dsa_switch *ds = dev->ds;
u8 port_member = 0, cpu_port;
const struct dsa_port *dp;
int i;
for (i = 0; i < dev->port_cnt; i++) {
if (i == port || i == dev->cpu_port)
if (!dsa_is_user_port(ds, port))
return;
dp = dsa_to_port(ds, port);
cpu_port = BIT(dsa_upstream_port(ds, port));
for (i = 0; i < ds->num_ports; i++) {
const struct dsa_port *other_dp = dsa_to_port(ds, i);
struct ksz_port *other_p = &dev->ports[i];
u8 val = 0;
if (!dsa_is_user_port(ds, i))
continue;
p = &dev->ports[i];
if (!(dev->member & (1 << i)))
if (port == i)
continue;
if (!dp->bridge_dev || dp->bridge_dev != other_dp->bridge_dev)
continue;
/* Port is a member of the bridge and is forwarding. */
if (p->stp_state == BR_STATE_FORWARDING &&
p->member != dev->member)
dev->dev_ops->cfg_port_member(dev, i, dev->member);
if (other_p->stp_state == BR_STATE_FORWARDING &&
p->stp_state == BR_STATE_FORWARDING) {
val |= BIT(port);
port_member |= BIT(i);
}
dev->dev_ops->cfg_port_member(dev, i, val | cpu_port);
}
dev->dev_ops->cfg_port_member(dev, port, port_member | cpu_port);
}
EXPORT_SYMBOL_GPL(ksz_update_port_member);
@ -175,12 +194,6 @@ EXPORT_SYMBOL_GPL(ksz_get_ethtool_stats);
int ksz_port_bridge_join(struct dsa_switch *ds, int port,
struct net_device *br)
{
struct ksz_device *dev = ds->priv;
mutex_lock(&dev->dev_mutex);
dev->br_member |= (1 << port);
mutex_unlock(&dev->dev_mutex);
/* port_stp_state_set() will be called after to put the port in
* appropriate state so there is no need to do anything.
*/
@ -192,13 +205,6 @@ EXPORT_SYMBOL_GPL(ksz_port_bridge_join);
void ksz_port_bridge_leave(struct dsa_switch *ds, int port,
struct net_device *br)
{
struct ksz_device *dev = ds->priv;
mutex_lock(&dev->dev_mutex);
dev->br_member &= ~(1 << port);
dev->member &= ~(1 << port);
mutex_unlock(&dev->dev_mutex);
/* port_stp_state_set() will be called after to put the port in
* forwarding state so there is no need to do anything.
*/

View File

@ -25,8 +25,6 @@ struct ksz_port_mib {
};
struct ksz_port {
u16 member;
u16 vid_member;
bool remove_tag; /* Remove Tag flag set, for ksz8795 only */
int stp_state;
struct phy_device phydev;
@ -83,8 +81,6 @@ struct ksz_device {
struct ksz_port *ports;
struct delayed_work mib_read;
unsigned long mib_read_interval;
u16 br_member;
u16 member;
u16 mirror_rx;
u16 mirror_tx;
u32 features; /* chip specific features */

View File

@ -1256,8 +1256,12 @@ qca8k_setup(struct dsa_switch *ds)
/* Set initial MTU for every port.
* We have only have a general MTU setting. So track
* every port and set the max across all port.
* Set per port MTU to 1500 as the MTU change function
* will add the overhead and if its set to 1518 then it
* will apply the overhead again and we will end up with
* MTU of 1536 instead of 1518
*/
priv->port_mtu[i] = ETH_FRAME_LEN + ETH_FCS_LEN;
priv->port_mtu[i] = ETH_DATA_LEN;
}
/* Special GLOBAL_FC_THRESH value are needed for ar8327 switch */
@ -1433,6 +1437,12 @@ qca8k_phylink_mac_config(struct dsa_switch *ds, int port, unsigned int mode,
qca8k_write(priv, QCA8K_REG_SGMII_CTRL, val);
/* From original code is reported port instability as SGMII also
* require delay set. Apply advised values here or take them from DT.
*/
if (state->interface == PHY_INTERFACE_MODE_SGMII)
qca8k_mac_config_setup_internal_delay(priv, cpu_port_index, reg);
/* For qca8327/qca8328/qca8334/qca8338 sgmii is unique and
* falling edge is set writing in the PORT0 PAD reg
*/
@ -1455,12 +1465,6 @@ qca8k_phylink_mac_config(struct dsa_switch *ds, int port, unsigned int mode,
QCA8K_PORT0_PAD_SGMII_TXCLK_FALLING_EDGE,
val);
/* From original code is reported port instability as SGMII also
* require delay set. Apply advised values here or take them from DT.
*/
if (state->interface == PHY_INTERFACE_MODE_SGMII)
qca8k_mac_config_setup_internal_delay(priv, cpu_port_index, reg);
break;
default:
dev_err(ds->dev, "xMII mode %s not supported for port %d\n",

View File

@ -298,13 +298,14 @@ bool aq_ring_tx_clean(struct aq_ring_s *self)
}
}
if (unlikely(buff->is_eop)) {
if (unlikely(buff->is_eop && buff->skb)) {
u64_stats_update_begin(&self->stats.tx.syncp);
++self->stats.tx.packets;
self->stats.tx.bytes += buff->skb->len;
u64_stats_update_end(&self->stats.tx.syncp);
dev_kfree_skb_any(buff->skb);
buff->skb = NULL;
}
buff->pa = 0U;
buff->eop_index = 0xffffU;

View File

@ -34,7 +34,7 @@ int axspi_read_status(struct axspi_data *ax_spi, struct spi_status *status)
/* OP */
ax_spi->cmd_buf[0] = AX_SPICMD_READ_STATUS;
ret = spi_write_then_read(ax_spi->spi, ax_spi->cmd_buf, 1, (u8 *)&status, 3);
ret = spi_write_then_read(ax_spi->spi, ax_spi->cmd_buf, 1, (u8 *)status, 3);
if (ret)
dev_err(&ax_spi->spi->dev, "%s() failed: ret = %d\n", __func__, ret);
else

View File

@ -3196,6 +3196,7 @@ static int cxgb4vf_pci_probe(struct pci_dev *pdev,
}
if (adapter->registered_device_map == 0) {
dev_err(&pdev->dev, "could not register any net devices\n");
err = -EINVAL;
goto err_disable_interrupts;
}

View File

@ -1081,7 +1081,8 @@ static void hns3_dump_page_pool_info(struct hns3_enet_ring *ring,
u32 j = 0;
sprintf(result[j++], "%u", index);
sprintf(result[j++], "%u", ring->page_pool->pages_state_hold_cnt);
sprintf(result[j++], "%u",
READ_ONCE(ring->page_pool->pages_state_hold_cnt));
sprintf(result[j++], "%u",
atomic_read(&ring->page_pool->pages_state_release_cnt));
sprintf(result[j++], "%u", ring->page_pool->p.pool_size);
@ -1106,6 +1107,11 @@ hns3_dbg_page_pool_info(struct hnae3_handle *h, char *buf, int len)
return -EFAULT;
}
if (!priv->ring[h->kinfo.num_tqps].page_pool) {
dev_err(&h->pdev->dev, "page pool is not initialized\n");
return -EFAULT;
}
for (i = 0; i < ARRAY_SIZE(page_pool_info_items); i++)
result[i] = &data_str[i][0];

View File

@ -987,6 +987,7 @@ static int hns3_set_reset(struct net_device *netdev, u32 *flags)
struct hnae3_ae_dev *ae_dev = pci_get_drvdata(h->pdev);
const struct hnae3_ae_ops *ops = h->ae_algo->ops;
const struct hns3_reset_type_map *rst_type_map;
enum ethtool_reset_flags rst_flags;
u32 i, size;
if (ops->ae_dev_resetting && ops->ae_dev_resetting(h))
@ -1006,6 +1007,7 @@ static int hns3_set_reset(struct net_device *netdev, u32 *flags)
for (i = 0; i < size; i++) {
if (rst_type_map[i].rst_flags == *flags) {
rst_type = rst_type_map[i].rst_type;
rst_flags = rst_type_map[i].rst_flags;
break;
}
}
@ -1021,6 +1023,8 @@ static int hns3_set_reset(struct net_device *netdev, u32 *flags)
ops->reset_event(h->pdev, h);
*flags &= ~rst_flags;
return 0;
}

View File

@ -703,9 +703,9 @@ static int hclgevf_set_rss_tc_mode(struct hclgevf_dev *hdev, u16 rss_size)
roundup_size = ilog2(roundup_size);
for (i = 0; i < HCLGEVF_MAX_TC_NUM; i++) {
tc_valid[i] = !!(hdev->hw_tc_map & BIT(i));
tc_valid[i] = 1;
tc_size[i] = roundup_size;
tc_offset[i] = rss_size * i;
tc_offset[i] = (hdev->hw_tc_map & BIT(i)) ? rss_size * i : 0;
}
hclgevf_cmd_setup_basic_desc(&desc, HCLGEVF_OPC_RSS_TC_MODE, false);

View File

@ -305,6 +305,7 @@ struct iavf_adapter {
#define IAVF_FLAG_AQ_DEL_FDIR_FILTER BIT(26)
#define IAVF_FLAG_AQ_ADD_ADV_RSS_CFG BIT(27)
#define IAVF_FLAG_AQ_DEL_ADV_RSS_CFG BIT(28)
#define IAVF_FLAG_AQ_REQUEST_STATS BIT(29)
/* OS defined structs */
struct net_device *netdev;
@ -444,6 +445,7 @@ int iavf_up(struct iavf_adapter *adapter);
void iavf_down(struct iavf_adapter *adapter);
int iavf_process_config(struct iavf_adapter *adapter);
void iavf_schedule_reset(struct iavf_adapter *adapter);
void iavf_schedule_request_stats(struct iavf_adapter *adapter);
void iavf_reset(struct iavf_adapter *adapter);
void iavf_set_ethtool_ops(struct net_device *netdev);
void iavf_update_stats(struct iavf_adapter *adapter);
@ -501,4 +503,5 @@ void iavf_add_adv_rss_cfg(struct iavf_adapter *adapter);
void iavf_del_adv_rss_cfg(struct iavf_adapter *adapter);
struct iavf_mac_filter *iavf_add_filter(struct iavf_adapter *adapter,
const u8 *macaddr);
int iavf_lock_timeout(struct mutex *lock, unsigned int msecs);
#endif /* _IAVF_H_ */

View File

@ -354,6 +354,9 @@ static void iavf_get_ethtool_stats(struct net_device *netdev,
struct iavf_adapter *adapter = netdev_priv(netdev);
unsigned int i;
/* Explicitly request stats refresh */
iavf_schedule_request_stats(adapter);
iavf_add_ethtool_stats(&data, adapter, iavf_gstrings_stats);
rcu_read_lock();
@ -723,12 +726,31 @@ static int iavf_get_per_queue_coalesce(struct net_device *netdev, u32 queue,
*
* Change the ITR settings for a specific queue.
**/
static void iavf_set_itr_per_queue(struct iavf_adapter *adapter,
struct ethtool_coalesce *ec, int queue)
static int iavf_set_itr_per_queue(struct iavf_adapter *adapter,
struct ethtool_coalesce *ec, int queue)
{
struct iavf_ring *rx_ring = &adapter->rx_rings[queue];
struct iavf_ring *tx_ring = &adapter->tx_rings[queue];
struct iavf_q_vector *q_vector;
u16 itr_setting;
itr_setting = rx_ring->itr_setting & ~IAVF_ITR_DYNAMIC;
if (ec->rx_coalesce_usecs != itr_setting &&
ec->use_adaptive_rx_coalesce) {
netif_info(adapter, drv, adapter->netdev,
"Rx interrupt throttling cannot be changed if adaptive-rx is enabled\n");
return -EINVAL;
}
itr_setting = tx_ring->itr_setting & ~IAVF_ITR_DYNAMIC;
if (ec->tx_coalesce_usecs != itr_setting &&
ec->use_adaptive_tx_coalesce) {
netif_info(adapter, drv, adapter->netdev,
"Tx interrupt throttling cannot be changed if adaptive-tx is enabled\n");
return -EINVAL;
}
rx_ring->itr_setting = ITR_REG_ALIGN(ec->rx_coalesce_usecs);
tx_ring->itr_setting = ITR_REG_ALIGN(ec->tx_coalesce_usecs);
@ -751,6 +773,7 @@ static void iavf_set_itr_per_queue(struct iavf_adapter *adapter,
* the Tx and Rx ITR values based on the values we have entered
* into the q_vector, no need to write the values now.
*/
return 0;
}
/**
@ -792,9 +815,11 @@ static int __iavf_set_coalesce(struct net_device *netdev,
*/
if (queue < 0) {
for (i = 0; i < adapter->num_active_queues; i++)
iavf_set_itr_per_queue(adapter, ec, i);
if (iavf_set_itr_per_queue(adapter, ec, i))
return -EINVAL;
} else if (queue < adapter->num_active_queues) {
iavf_set_itr_per_queue(adapter, ec, queue);
if (iavf_set_itr_per_queue(adapter, ec, queue))
return -EINVAL;
} else {
netif_info(adapter, drv, netdev, "Invalid queue value, queue range is 0 - %d\n",
adapter->num_active_queues - 1);

View File

@ -147,7 +147,7 @@ enum iavf_status iavf_free_virt_mem_d(struct iavf_hw *hw,
*
* Returns 0 on success, negative on failure
**/
static int iavf_lock_timeout(struct mutex *lock, unsigned int msecs)
int iavf_lock_timeout(struct mutex *lock, unsigned int msecs)
{
unsigned int wait, delay = 10;
@ -174,6 +174,19 @@ void iavf_schedule_reset(struct iavf_adapter *adapter)
}
}
/**
* iavf_schedule_request_stats - Set the flags and schedule statistics request
* @adapter: board private structure
*
* Sets IAVF_FLAG_AQ_REQUEST_STATS flag so iavf_watchdog_task() will explicitly
* request and refresh ethtool stats
**/
void iavf_schedule_request_stats(struct iavf_adapter *adapter)
{
adapter->aq_required |= IAVF_FLAG_AQ_REQUEST_STATS;
mod_delayed_work(iavf_wq, &adapter->watchdog_task, 0);
}
/**
* iavf_tx_timeout - Respond to a Tx Hang
* @netdev: network interface device structure
@ -704,13 +717,11 @@ static void iavf_del_vlan(struct iavf_adapter *adapter, u16 vlan)
**/
static void iavf_restore_filters(struct iavf_adapter *adapter)
{
/* re-add all VLAN filters */
if (VLAN_ALLOWED(adapter)) {
u16 vid;
u16 vid;
for_each_set_bit(vid, adapter->vsi.active_vlans, VLAN_N_VID)
iavf_add_vlan(adapter, vid);
}
/* re-add all VLAN filters */
for_each_set_bit(vid, adapter->vsi.active_vlans, VLAN_N_VID)
iavf_add_vlan(adapter, vid);
}
/**
@ -745,9 +756,6 @@ static int iavf_vlan_rx_kill_vid(struct net_device *netdev,
{
struct iavf_adapter *adapter = netdev_priv(netdev);
if (!VLAN_ALLOWED(adapter))
return -EIO;
iavf_del_vlan(adapter, vid);
clear_bit(vid, adapter->vsi.active_vlans);
@ -1709,6 +1717,11 @@ static int iavf_process_aq_command(struct iavf_adapter *adapter)
iavf_del_adv_rss_cfg(adapter);
return 0;
}
if (adapter->aq_required & IAVF_FLAG_AQ_REQUEST_STATS) {
iavf_request_stats(adapter);
return 0;
}
return -EAGAIN;
}
@ -2173,7 +2186,6 @@ static void iavf_reset_task(struct work_struct *work)
struct net_device *netdev = adapter->netdev;
struct iavf_hw *hw = &adapter->hw;
struct iavf_mac_filter *f, *ftmp;
struct iavf_vlan_filter *vlf;
struct iavf_cloud_filter *cf;
u32 reg_val;
int i = 0, err;
@ -2254,6 +2266,7 @@ static void iavf_reset_task(struct work_struct *work)
(adapter->state == __IAVF_RESETTING));
if (running) {
netdev->flags &= ~IFF_UP;
netif_carrier_off(netdev);
netif_tx_stop_all_queues(netdev);
adapter->link_up = false;
@ -2313,11 +2326,6 @@ static void iavf_reset_task(struct work_struct *work)
list_for_each_entry(f, &adapter->mac_filter_list, list) {
f->add = true;
}
/* re-add all VLAN filters */
list_for_each_entry(vlf, &adapter->vlan_filter_list, list) {
vlf->add = true;
}
spin_unlock_bh(&adapter->mac_vlan_list_lock);
/* check if TCs are running and re-add all cloud filters */
@ -2331,7 +2339,6 @@ static void iavf_reset_task(struct work_struct *work)
spin_unlock_bh(&adapter->cloud_filter_list_lock);
adapter->aq_required |= IAVF_FLAG_AQ_ADD_MAC_FILTER;
adapter->aq_required |= IAVF_FLAG_AQ_ADD_VLAN_FILTER;
adapter->aq_required |= IAVF_FLAG_AQ_ADD_CLOUD_FILTER;
iavf_misc_irq_enable(adapter);
@ -2365,7 +2372,7 @@ static void iavf_reset_task(struct work_struct *work)
* to __IAVF_RUNNING
*/
iavf_up_complete(adapter);
netdev->flags |= IFF_UP;
iavf_irq_enable(adapter, true);
} else {
iavf_change_state(adapter, __IAVF_DOWN);
@ -2378,8 +2385,10 @@ static void iavf_reset_task(struct work_struct *work)
reset_err:
mutex_unlock(&adapter->client_lock);
mutex_unlock(&adapter->crit_lock);
if (running)
if (running) {
iavf_change_state(adapter, __IAVF_RUNNING);
netdev->flags |= IFF_UP;
}
dev_err(&adapter->pdev->dev, "failed to allocate resources during reinit\n");
iavf_close(netdev);
}
@ -3441,11 +3450,16 @@ static int iavf_set_features(struct net_device *netdev,
{
struct iavf_adapter *adapter = netdev_priv(netdev);
/* Don't allow changing VLAN_RX flag when adapter is not capable
* of VLAN offload
/* Don't allow enabling VLAN features when adapter is not capable
* of VLAN offload/filtering
*/
if (!VLAN_ALLOWED(adapter)) {
if ((netdev->features ^ features) & NETIF_F_HW_VLAN_CTAG_RX)
netdev->hw_features &= ~(NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_CTAG_FILTER);
if (features & (NETIF_F_HW_VLAN_CTAG_RX |
NETIF_F_HW_VLAN_CTAG_TX |
NETIF_F_HW_VLAN_CTAG_FILTER))
return -EINVAL;
} else if ((netdev->features ^ features) & NETIF_F_HW_VLAN_CTAG_RX) {
if (features & NETIF_F_HW_VLAN_CTAG_RX)

View File

@ -607,7 +607,7 @@ void iavf_add_vlans(struct iavf_adapter *adapter)
if (f->add)
count++;
}
if (!count) {
if (!count || !VLAN_ALLOWED(adapter)) {
adapter->aq_required &= ~IAVF_FLAG_AQ_ADD_VLAN_FILTER;
spin_unlock_bh(&adapter->mac_vlan_list_lock);
return;
@ -673,9 +673,19 @@ void iavf_del_vlans(struct iavf_adapter *adapter)
spin_lock_bh(&adapter->mac_vlan_list_lock);
list_for_each_entry(f, &adapter->vlan_filter_list, list) {
if (f->remove)
list_for_each_entry_safe(f, ftmp, &adapter->vlan_filter_list, list) {
/* since VLAN capabilities are not allowed, we dont want to send
* a VLAN delete request because it will most likely fail and
* create unnecessary errors/noise, so just free the VLAN
* filters marked for removal to enable bailing out before
* sending a virtchnl message
*/
if (f->remove && !VLAN_ALLOWED(adapter)) {
list_del(&f->list);
kfree(f);
} else if (f->remove) {
count++;
}
}
if (!count) {
adapter->aq_required &= ~IAVF_FLAG_AQ_DEL_VLAN_FILTER;
@ -784,6 +794,8 @@ void iavf_request_stats(struct iavf_adapter *adapter)
/* no error message, this isn't crucial */
return;
}
adapter->aq_required &= ~IAVF_FLAG_AQ_REQUEST_STATS;
adapter->current_op = VIRTCHNL_OP_GET_STATS;
vqs.vsi_id = adapter->vsi_res->vsi_id;
/* queue maps are ignored for this message - only the vsi is used */
@ -1722,8 +1734,37 @@ void iavf_virtchnl_completion(struct iavf_adapter *adapter,
}
spin_lock_bh(&adapter->mac_vlan_list_lock);
iavf_add_filter(adapter, adapter->hw.mac.addr);
if (VLAN_ALLOWED(adapter)) {
if (!list_empty(&adapter->vlan_filter_list)) {
struct iavf_vlan_filter *vlf;
/* re-add all VLAN filters over virtchnl */
list_for_each_entry(vlf,
&adapter->vlan_filter_list,
list)
vlf->add = true;
adapter->aq_required |=
IAVF_FLAG_AQ_ADD_VLAN_FILTER;
}
}
spin_unlock_bh(&adapter->mac_vlan_list_lock);
iavf_process_config(adapter);
/* unlock crit_lock before acquiring rtnl_lock as other
* processes holding rtnl_lock could be waiting for the same
* crit_lock
*/
mutex_unlock(&adapter->crit_lock);
rtnl_lock();
netdev_update_features(adapter->netdev);
rtnl_unlock();
if (iavf_lock_timeout(&adapter->crit_lock, 10000))
dev_warn(&adapter->pdev->dev, "failed to acquire crit_lock in %s\n",
__FUNCTION__);
}
break;
case VIRTCHNL_OP_ENABLE_QUEUES:

View File

@ -89,8 +89,13 @@ static int ice_vsi_alloc_arrays(struct ice_vsi *vsi)
if (!vsi->rx_rings)
goto err_rings;
/* XDP will have vsi->alloc_txq Tx queues as well, so double the size */
vsi->txq_map = devm_kcalloc(dev, (2 * vsi->alloc_txq),
/* txq_map needs to have enough space to track both Tx (stack) rings
* and XDP rings; at this point vsi->num_xdp_txq might not be set,
* so use num_possible_cpus() as we want to always provide XDP ring
* per CPU, regardless of queue count settings from user that might
* have come from ethtool's set_channels() callback;
*/
vsi->txq_map = devm_kcalloc(dev, (vsi->alloc_txq + num_possible_cpus()),
sizeof(*vsi->txq_map), GFP_KERNEL);
if (!vsi->txq_map)

View File

@ -2609,7 +2609,18 @@ int ice_prepare_xdp_rings(struct ice_vsi *vsi, struct bpf_prog *prog)
ice_stat_str(status));
goto clear_xdp_rings;
}
ice_vsi_assign_bpf_prog(vsi, prog);
/* assign the prog only when it's not already present on VSI;
* this flow is a subject of both ethtool -L and ndo_bpf flows;
* VSI rebuild that happens under ethtool -L can expose us to
* the bpf_prog refcount issues as we would be swapping same
* bpf_prog pointers from vsi->xdp_prog and calling bpf_prog_put
* on it as it would be treated as an 'old_prog'; for ndo_bpf
* this is not harmful as dev_xdp_install bumps the refcount
* before calling the op exposed by the driver;
*/
if (!ice_is_xdp_ena_vsi(vsi))
ice_vsi_assign_bpf_prog(vsi, prog);
return 0;
clear_xdp_rings:
@ -2785,6 +2796,11 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog,
if (xdp_ring_err)
NL_SET_ERR_MSG_MOD(extack, "Freeing XDP Tx resources failed");
} else {
/* safe to call even when prog == vsi->xdp_prog as
* dev_xdp_install in net/core/dev.c incremented prog's
* refcount so corresponding bpf_prog_put won't cause
* underflow
*/
ice_vsi_assign_bpf_prog(vsi, prog);
}

View File

@ -8026,7 +8026,7 @@ static int igb_poll(struct napi_struct *napi, int budget)
if (likely(napi_complete_done(napi, work_done)))
igb_ring_irq_enable(q_vector);
return min(work_done, budget - 1);
return work_done;
}
/**

View File

@ -5017,11 +5017,13 @@ static int mvpp2_change_mtu(struct net_device *dev, int mtu)
mtu = ALIGN(MVPP2_RX_PKT_SIZE(mtu), 8);
}
if (port->xdp_prog && mtu > MVPP2_MAX_RX_BUF_SIZE) {
netdev_err(dev, "Illegal MTU value %d (> %d) for XDP mode\n",
mtu, (int)MVPP2_MAX_RX_BUF_SIZE);
return -EINVAL;
}
if (MVPP2_RX_PKT_SIZE(mtu) > MVPP2_BM_LONG_PKT_SIZE) {
if (port->xdp_prog) {
netdev_err(dev, "Jumbo frames are not supported with XDP\n");
return -EINVAL;
}
if (priv->percpu_pools) {
netdev_warn(dev, "mtu %d too high, switching to shared buffers", mtu);
mvpp2_bm_switch_buffers(priv, false);
@ -5307,8 +5309,8 @@ static int mvpp2_xdp_setup(struct mvpp2_port *port, struct netdev_bpf *bpf)
bool running = netif_running(port->dev);
bool reset = !prog != !port->xdp_prog;
if (port->dev->mtu > ETH_DATA_LEN) {
NL_SET_ERR_MSG_MOD(bpf->extack, "XDP is not supported with jumbo frames enabled");
if (port->dev->mtu > MVPP2_MAX_RX_BUF_SIZE) {
NL_SET_ERR_MSG_MOD(bpf->extack, "MTU too large for XDP");
return -EOPNOTSUPP;
}

View File

@ -497,8 +497,8 @@ int prestera_bridge_port_join(struct net_device *br_dev,
br_port = prestera_bridge_port_add(bridge, port->dev);
if (IS_ERR(br_port)) {
err = PTR_ERR(br_port);
goto err_brport_create;
prestera_bridge_put(bridge);
return PTR_ERR(br_port);
}
err = switchdev_bridge_port_offload(br_port->dev, port->dev, NULL,
@ -519,8 +519,6 @@ int prestera_bridge_port_join(struct net_device *br_dev,
switchdev_bridge_port_unoffload(br_port->dev, NULL, NULL, NULL);
err_switchdev_offload:
prestera_bridge_port_put(br_port);
err_brport_create:
prestera_bridge_put(bridge);
return err;
}
@ -1124,7 +1122,7 @@ static int prestera_switchdev_blk_event(struct notifier_block *unused,
prestera_port_obj_attr_set);
break;
default:
err = -EOPNOTSUPP;
return NOTIFY_DONE;
}
return notifier_from_errno(err);

View File

@ -2153,7 +2153,7 @@ static void mlxsw_sp_pude_event_func(const struct mlxsw_reg_info *reg,
max_ports = mlxsw_core_max_ports(mlxsw_sp->core);
local_port = mlxsw_reg_pude_local_port_get(pude_pl);
if (WARN_ON_ONCE(local_port >= max_ports))
if (WARN_ON_ONCE(!local_port || local_port >= max_ports))
return;
mlxsw_sp_port = mlxsw_sp->ports[local_port];
if (!mlxsw_sp_port)
@ -3290,10 +3290,10 @@ mlxsw_sp_resources_rif_mac_profile_register(struct mlxsw_core *mlxsw_core)
u8 max_rif_mac_profiles;
if (!MLXSW_CORE_RES_VALID(mlxsw_core, MAX_RIF_MAC_PROFILES))
return -EIO;
max_rif_mac_profiles = MLXSW_CORE_RES_GET(mlxsw_core,
MAX_RIF_MAC_PROFILES);
max_rif_mac_profiles = 1;
else
max_rif_mac_profiles = MLXSW_CORE_RES_GET(mlxsw_core,
MAX_RIF_MAC_PROFILES);
devlink_resource_size_params_init(&size_params, max_rif_mac_profiles,
max_rif_mac_profiles, 1,
DEVLINK_RESOURCE_UNIT_ENTRY);

View File

@ -914,8 +914,7 @@ static int lan743x_phy_reset(struct lan743x_adapter *adapter)
}
static void lan743x_phy_update_flowcontrol(struct lan743x_adapter *adapter,
u8 duplex, u16 local_adv,
u16 remote_adv)
u16 local_adv, u16 remote_adv)
{
struct lan743x_phy *phy = &adapter->phy;
u8 cap;
@ -943,7 +942,6 @@ static void lan743x_phy_link_status_change(struct net_device *netdev)
phy_print_status(phydev);
if (phydev->state == PHY_RUNNING) {
struct ethtool_link_ksettings ksettings;
int remote_advertisement = 0;
int local_advertisement = 0;
@ -980,18 +978,14 @@ static void lan743x_phy_link_status_change(struct net_device *netdev)
}
lan743x_csr_write(adapter, MAC_CR, data);
memset(&ksettings, 0, sizeof(ksettings));
phy_ethtool_get_link_ksettings(netdev, &ksettings);
local_advertisement =
linkmode_adv_to_mii_adv_t(phydev->advertising);
remote_advertisement =
linkmode_adv_to_mii_adv_t(phydev->lp_advertising);
lan743x_phy_update_flowcontrol(adapter,
ksettings.base.duplex,
local_advertisement,
lan743x_phy_update_flowcontrol(adapter, local_advertisement,
remote_advertisement);
lan743x_ptp_update_latency(adapter, ksettings.base.speed);
lan743x_ptp_update_latency(adapter, phydev->speed);
}
}

View File

@ -1278,6 +1278,225 @@ int ocelot_fdb_dump(struct ocelot *ocelot, int port,
}
EXPORT_SYMBOL(ocelot_fdb_dump);
static void ocelot_populate_l2_ptp_trap_key(struct ocelot_vcap_filter *trap)
{
trap->key_type = OCELOT_VCAP_KEY_ETYPE;
*(__be16 *)trap->key.etype.etype.value = htons(ETH_P_1588);
*(__be16 *)trap->key.etype.etype.mask = htons(0xffff);
}
static void
ocelot_populate_ipv4_ptp_event_trap_key(struct ocelot_vcap_filter *trap)
{
trap->key_type = OCELOT_VCAP_KEY_IPV4;
trap->key.ipv4.dport.value = PTP_EV_PORT;
trap->key.ipv4.dport.mask = 0xffff;
}
static void
ocelot_populate_ipv6_ptp_event_trap_key(struct ocelot_vcap_filter *trap)
{
trap->key_type = OCELOT_VCAP_KEY_IPV6;
trap->key.ipv6.dport.value = PTP_EV_PORT;
trap->key.ipv6.dport.mask = 0xffff;
}
static void
ocelot_populate_ipv4_ptp_general_trap_key(struct ocelot_vcap_filter *trap)
{
trap->key_type = OCELOT_VCAP_KEY_IPV4;
trap->key.ipv4.dport.value = PTP_GEN_PORT;
trap->key.ipv4.dport.mask = 0xffff;
}
static void
ocelot_populate_ipv6_ptp_general_trap_key(struct ocelot_vcap_filter *trap)
{
trap->key_type = OCELOT_VCAP_KEY_IPV6;
trap->key.ipv6.dport.value = PTP_GEN_PORT;
trap->key.ipv6.dport.mask = 0xffff;
}
static int ocelot_trap_add(struct ocelot *ocelot, int port,
unsigned long cookie,
void (*populate)(struct ocelot_vcap_filter *f))
{
struct ocelot_vcap_block *block_vcap_is2;
struct ocelot_vcap_filter *trap;
bool new = false;
int err;
block_vcap_is2 = &ocelot->block[VCAP_IS2];
trap = ocelot_vcap_block_find_filter_by_id(block_vcap_is2, cookie,
false);
if (!trap) {
trap = kzalloc(sizeof(*trap), GFP_KERNEL);
if (!trap)
return -ENOMEM;
populate(trap);
trap->prio = 1;
trap->id.cookie = cookie;
trap->id.tc_offload = false;
trap->block_id = VCAP_IS2;
trap->type = OCELOT_VCAP_FILTER_OFFLOAD;
trap->lookup = 0;
trap->action.cpu_copy_ena = true;
trap->action.mask_mode = OCELOT_MASK_MODE_PERMIT_DENY;
trap->action.port_mask = 0;
new = true;
}
trap->ingress_port_mask |= BIT(port);
if (new)
err = ocelot_vcap_filter_add(ocelot, trap, NULL);
else
err = ocelot_vcap_filter_replace(ocelot, trap);
if (err) {
trap->ingress_port_mask &= ~BIT(port);
if (!trap->ingress_port_mask)
kfree(trap);
return err;
}
return 0;
}
static int ocelot_trap_del(struct ocelot *ocelot, int port,
unsigned long cookie)
{
struct ocelot_vcap_block *block_vcap_is2;
struct ocelot_vcap_filter *trap;
block_vcap_is2 = &ocelot->block[VCAP_IS2];
trap = ocelot_vcap_block_find_filter_by_id(block_vcap_is2, cookie,
false);
if (!trap)
return 0;
trap->ingress_port_mask &= ~BIT(port);
if (!trap->ingress_port_mask)
return ocelot_vcap_filter_del(ocelot, trap);
return ocelot_vcap_filter_replace(ocelot, trap);
}
static int ocelot_l2_ptp_trap_add(struct ocelot *ocelot, int port)
{
unsigned long l2_cookie = ocelot->num_phys_ports + 1;
return ocelot_trap_add(ocelot, port, l2_cookie,
ocelot_populate_l2_ptp_trap_key);
}
static int ocelot_l2_ptp_trap_del(struct ocelot *ocelot, int port)
{
unsigned long l2_cookie = ocelot->num_phys_ports + 1;
return ocelot_trap_del(ocelot, port, l2_cookie);
}
static int ocelot_ipv4_ptp_trap_add(struct ocelot *ocelot, int port)
{
unsigned long ipv4_gen_cookie = ocelot->num_phys_ports + 2;
unsigned long ipv4_ev_cookie = ocelot->num_phys_ports + 3;
int err;
err = ocelot_trap_add(ocelot, port, ipv4_ev_cookie,
ocelot_populate_ipv4_ptp_event_trap_key);
if (err)
return err;
err = ocelot_trap_add(ocelot, port, ipv4_gen_cookie,
ocelot_populate_ipv4_ptp_general_trap_key);
if (err)
ocelot_trap_del(ocelot, port, ipv4_ev_cookie);
return err;
}
static int ocelot_ipv4_ptp_trap_del(struct ocelot *ocelot, int port)
{
unsigned long ipv4_gen_cookie = ocelot->num_phys_ports + 2;
unsigned long ipv4_ev_cookie = ocelot->num_phys_ports + 3;
int err;
err = ocelot_trap_del(ocelot, port, ipv4_ev_cookie);
err |= ocelot_trap_del(ocelot, port, ipv4_gen_cookie);
return err;
}
static int ocelot_ipv6_ptp_trap_add(struct ocelot *ocelot, int port)
{
unsigned long ipv6_gen_cookie = ocelot->num_phys_ports + 4;
unsigned long ipv6_ev_cookie = ocelot->num_phys_ports + 5;
int err;
err = ocelot_trap_add(ocelot, port, ipv6_ev_cookie,
ocelot_populate_ipv6_ptp_event_trap_key);
if (err)
return err;
err = ocelot_trap_add(ocelot, port, ipv6_gen_cookie,
ocelot_populate_ipv6_ptp_general_trap_key);
if (err)
ocelot_trap_del(ocelot, port, ipv6_ev_cookie);
return err;
}
static int ocelot_ipv6_ptp_trap_del(struct ocelot *ocelot, int port)
{
unsigned long ipv6_gen_cookie = ocelot->num_phys_ports + 4;
unsigned long ipv6_ev_cookie = ocelot->num_phys_ports + 5;
int err;
err = ocelot_trap_del(ocelot, port, ipv6_ev_cookie);
err |= ocelot_trap_del(ocelot, port, ipv6_gen_cookie);
return err;
}
static int ocelot_setup_ptp_traps(struct ocelot *ocelot, int port,
bool l2, bool l4)
{
int err;
if (l2)
err = ocelot_l2_ptp_trap_add(ocelot, port);
else
err = ocelot_l2_ptp_trap_del(ocelot, port);
if (err)
return err;
if (l4) {
err = ocelot_ipv4_ptp_trap_add(ocelot, port);
if (err)
goto err_ipv4;
err = ocelot_ipv6_ptp_trap_add(ocelot, port);
if (err)
goto err_ipv6;
} else {
err = ocelot_ipv4_ptp_trap_del(ocelot, port);
err |= ocelot_ipv6_ptp_trap_del(ocelot, port);
}
if (err)
return err;
return 0;
err_ipv6:
ocelot_ipv4_ptp_trap_del(ocelot, port);
err_ipv4:
if (l2)
ocelot_l2_ptp_trap_del(ocelot, port);
return err;
}
int ocelot_hwstamp_get(struct ocelot *ocelot, int port, struct ifreq *ifr)
{
return copy_to_user(ifr->ifr_data, &ocelot->hwtstamp_config,
@ -1288,7 +1507,9 @@ EXPORT_SYMBOL(ocelot_hwstamp_get);
int ocelot_hwstamp_set(struct ocelot *ocelot, int port, struct ifreq *ifr)
{
struct ocelot_port *ocelot_port = ocelot->ports[port];
bool l2 = false, l4 = false;
struct hwtstamp_config cfg;
int err;
if (copy_from_user(&cfg, ifr->ifr_data, sizeof(cfg)))
return -EFAULT;
@ -1320,28 +1541,40 @@ int ocelot_hwstamp_set(struct ocelot *ocelot, int port, struct ifreq *ifr)
switch (cfg.rx_filter) {
case HWTSTAMP_FILTER_NONE:
break;
case HWTSTAMP_FILTER_ALL:
case HWTSTAMP_FILTER_SOME:
case HWTSTAMP_FILTER_PTP_V1_L4_EVENT:
case HWTSTAMP_FILTER_PTP_V1_L4_SYNC:
case HWTSTAMP_FILTER_PTP_V1_L4_DELAY_REQ:
case HWTSTAMP_FILTER_NTP_ALL:
case HWTSTAMP_FILTER_PTP_V2_L4_EVENT:
case HWTSTAMP_FILTER_PTP_V2_L4_SYNC:
case HWTSTAMP_FILTER_PTP_V2_L4_DELAY_REQ:
l4 = true;
break;
case HWTSTAMP_FILTER_PTP_V2_L2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_L2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_L2_DELAY_REQ:
l2 = true;
break;
case HWTSTAMP_FILTER_PTP_V2_EVENT:
case HWTSTAMP_FILTER_PTP_V2_SYNC:
case HWTSTAMP_FILTER_PTP_V2_DELAY_REQ:
cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
l2 = true;
l4 = true;
break;
default:
mutex_unlock(&ocelot->ptp_lock);
return -ERANGE;
}
err = ocelot_setup_ptp_traps(ocelot, port, l2, l4);
if (err)
return err;
if (l2 && l4)
cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
else if (l2)
cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_L2_EVENT;
else if (l4)
cfg.rx_filter = HWTSTAMP_FILTER_PTP_V2_L4_EVENT;
else
cfg.rx_filter = HWTSTAMP_FILTER_NONE;
/* Commit back the result & save it */
memcpy(&ocelot->hwtstamp_config, &cfg, sizeof(cfg));
mutex_unlock(&ocelot->ptp_lock);
@ -1444,7 +1677,10 @@ int ocelot_get_ts_info(struct ocelot *ocelot, int port,
SOF_TIMESTAMPING_RAW_HARDWARE;
info->tx_types = BIT(HWTSTAMP_TX_OFF) | BIT(HWTSTAMP_TX_ON) |
BIT(HWTSTAMP_TX_ONESTEP_SYNC);
info->rx_filters = BIT(HWTSTAMP_FILTER_NONE) | BIT(HWTSTAMP_FILTER_ALL);
info->rx_filters = BIT(HWTSTAMP_FILTER_NONE) |
BIT(HWTSTAMP_FILTER_PTP_V2_EVENT) |
BIT(HWTSTAMP_FILTER_PTP_V2_L2_EVENT) |
BIT(HWTSTAMP_FILTER_PTP_V2_L4_EVENT);
return 0;
}

View File

@ -1217,6 +1217,22 @@ int ocelot_vcap_filter_del(struct ocelot *ocelot,
}
EXPORT_SYMBOL(ocelot_vcap_filter_del);
int ocelot_vcap_filter_replace(struct ocelot *ocelot,
struct ocelot_vcap_filter *filter)
{
struct ocelot_vcap_block *block = &ocelot->block[filter->block_id];
int index;
index = ocelot_vcap_block_get_filter_index(block, filter);
if (index < 0)
return index;
vcap_entry_set(ocelot, index, filter);
return 0;
}
EXPORT_SYMBOL(ocelot_vcap_filter_replace);
int ocelot_vcap_filter_stats_update(struct ocelot *ocelot,
struct ocelot_vcap_filter *filter)
{

View File

@ -565,7 +565,6 @@ struct nfp_net_dp {
* @exn_name: Name for Exception interrupt
* @shared_handler: Handler for shared interrupts
* @shared_name: Name for shared interrupt
* @me_freq_mhz: ME clock_freq (MHz)
* @reconfig_lock: Protects @reconfig_posted, @reconfig_timer_active,
* @reconfig_sync_present and HW reconfiguration request
* regs/machinery from async requests (sync must take
@ -650,8 +649,6 @@ struct nfp_net {
irq_handler_t shared_handler;
char shared_name[IFNAMSIZ + 8];
u32 me_freq_mhz;
bool link_up;
spinlock_t link_status_lock;

View File

@ -1344,7 +1344,7 @@ static int nfp_net_set_coalesce(struct net_device *netdev,
* ME timestamp ticks. There are 16 ME clock cycles for each timestamp
* count.
*/
factor = nn->me_freq_mhz / 16;
factor = nn->tlv_caps.me_freq_mhz / 16;
/* Each pair of (usecs, max_frames) fields specifies that interrupts
* should be coalesced until

View File

@ -1209,7 +1209,7 @@ static void *nixge_get_nvmem_address(struct device *dev)
cell = nvmem_cell_get(dev, "address");
if (IS_ERR(cell))
return NULL;
return cell;
mac = nvmem_cell_read(cell, &cell_size);
nvmem_cell_put(cell);
@ -1282,7 +1282,7 @@ static int nixge_probe(struct platform_device *pdev)
ndev->max_mtu = NIXGE_JUMBO_MTU;
mac_addr = nixge_get_nvmem_address(&pdev->dev);
if (mac_addr && is_valid_ether_addr(mac_addr)) {
if (!IS_ERR(mac_addr) && is_valid_ether_addr(mac_addr)) {
eth_hw_addr_set(ndev, mac_addr);
kfree(mac_addr);
} else {

View File

@ -1045,7 +1045,7 @@ static int qed_int_deassertion(struct qed_hwfn *p_hwfn,
if (!parities)
continue;
for (j = 0, bit_idx = 0; bit_idx < 32; j++) {
for (j = 0, bit_idx = 0; bit_idx < 32 && j < 32; j++) {
struct aeu_invert_reg_bit *p_bit = &p_aeu->bits[j];
if (qed_int_is_parity_flag(p_hwfn, p_bit) &&
@ -1083,7 +1083,7 @@ static int qed_int_deassertion(struct qed_hwfn *p_hwfn,
* to current group, making them responsible for the
* previous assertion.
*/
for (j = 0, bit_idx = 0; bit_idx < 32; j++) {
for (j = 0, bit_idx = 0; bit_idx < 32 && j < 32; j++) {
long unsigned int bitmask;
u8 bit, bit_len;
@ -1382,7 +1382,7 @@ static void qed_int_sb_attn_init(struct qed_hwfn *p_hwfn,
memset(sb_info->parity_mask, 0, sizeof(u32) * NUM_ATTN_REGS);
for (i = 0; i < NUM_ATTN_REGS; i++) {
/* j is array index, k is bit index */
for (j = 0, k = 0; k < 32; j++) {
for (j = 0, k = 0; k < 32 && j < 32; j++) {
struct aeu_invert_reg_bit *p_aeu;
p_aeu = &aeu_descs[i].bits[j];

View File

@ -5217,8 +5217,8 @@ static int rtl_get_ether_clk(struct rtl8169_private *tp)
static void rtl_init_mac_address(struct rtl8169_private *tp)
{
u8 mac_addr[ETH_ALEN] __aligned(2) = {};
struct net_device *dev = tp->dev;
u8 mac_addr[ETH_ALEN];
int rc;
rc = eth_platform_get_mac_address(tp_to_dev(tp), mac_addr);
@ -5233,7 +5233,8 @@ static void rtl_init_mac_address(struct rtl8169_private *tp)
if (is_valid_ether_addr(mac_addr))
goto done;
eth_hw_addr_random(dev);
eth_random_addr(mac_addr);
dev->addr_assign_type = NET_ADDR_RANDOM;
dev_warn(tp_to_dev(tp), "can't read MAC address, setting random one\n");
done:
eth_hw_addr_set(dev, mac_addr);

View File

@ -314,6 +314,7 @@ int stmmac_mdio_reset(struct mii_bus *mii);
int stmmac_xpcs_setup(struct mii_bus *mii);
void stmmac_set_ethtool_ops(struct net_device *netdev);
int stmmac_init_tstamp_counter(struct stmmac_priv *priv, u32 systime_flags);
void stmmac_ptp_register(struct stmmac_priv *priv);
void stmmac_ptp_unregister(struct stmmac_priv *priv);
int stmmac_open(struct net_device *dev);

View File

@ -50,6 +50,13 @@
#include "dwxgmac2.h"
#include "hwif.h"
/* As long as the interface is active, we keep the timestamping counter enabled
* with fine resolution and binary rollover. This avoid non-monotonic behavior
* (clock jumps) when changing timestamping settings at runtime.
*/
#define STMMAC_HWTS_ACTIVE (PTP_TCR_TSENA | PTP_TCR_TSCFUPDT | \
PTP_TCR_TSCTRLSSR)
#define STMMAC_ALIGN(x) ALIGN(ALIGN(x, SMP_CACHE_BYTES), 16)
#define TSO_MAX_BUFF_SIZE (SZ_16K - 1)
@ -613,8 +620,6 @@ static int stmmac_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
{
struct stmmac_priv *priv = netdev_priv(dev);
struct hwtstamp_config config;
struct timespec64 now;
u64 temp = 0;
u32 ptp_v2 = 0;
u32 tstamp_all = 0;
u32 ptp_over_ipv4_udp = 0;
@ -623,11 +628,6 @@ static int stmmac_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
u32 snap_type_sel = 0;
u32 ts_master_en = 0;
u32 ts_event_en = 0;
u32 sec_inc = 0;
u32 value = 0;
bool xmac;
xmac = priv->plat->has_gmac4 || priv->plat->has_xgmac;
if (!(priv->dma_cap.time_stamp || priv->adv_ts)) {
netdev_alert(priv->dev, "No support for HW time stamping\n");
@ -789,42 +789,17 @@ static int stmmac_hwtstamp_set(struct net_device *dev, struct ifreq *ifr)
priv->hwts_rx_en = ((config.rx_filter == HWTSTAMP_FILTER_NONE) ? 0 : 1);
priv->hwts_tx_en = config.tx_type == HWTSTAMP_TX_ON;
if (!priv->hwts_tx_en && !priv->hwts_rx_en)
stmmac_config_hw_tstamping(priv, priv->ptpaddr, 0);
else {
value = (PTP_TCR_TSENA | PTP_TCR_TSCFUPDT | PTP_TCR_TSCTRLSSR |
tstamp_all | ptp_v2 | ptp_over_ethernet |
ptp_over_ipv6_udp | ptp_over_ipv4_udp | ts_event_en |
ts_master_en | snap_type_sel);
stmmac_config_hw_tstamping(priv, priv->ptpaddr, value);
priv->systime_flags = STMMAC_HWTS_ACTIVE;
/* program Sub Second Increment reg */
stmmac_config_sub_second_increment(priv,
priv->ptpaddr, priv->plat->clk_ptp_rate,
xmac, &sec_inc);
temp = div_u64(1000000000ULL, sec_inc);
/* Store sub second increment and flags for later use */
priv->sub_second_inc = sec_inc;
priv->systime_flags = value;
/* calculate default added value:
* formula is :
* addend = (2^32)/freq_div_ratio;
* where, freq_div_ratio = 1e9ns/sec_inc
*/
temp = (u64)(temp << 32);
priv->default_addend = div_u64(temp, priv->plat->clk_ptp_rate);
stmmac_config_addend(priv, priv->ptpaddr, priv->default_addend);
/* initialize system time */
ktime_get_real_ts64(&now);
/* lower 32 bits of tv_sec are safe until y2106 */
stmmac_init_systime(priv, priv->ptpaddr,
(u32)now.tv_sec, now.tv_nsec);
if (priv->hwts_tx_en || priv->hwts_rx_en) {
priv->systime_flags |= tstamp_all | ptp_v2 |
ptp_over_ethernet | ptp_over_ipv6_udp |
ptp_over_ipv4_udp | ts_event_en |
ts_master_en | snap_type_sel;
}
stmmac_config_hw_tstamping(priv, priv->ptpaddr, priv->systime_flags);
memcpy(&priv->tstamp_config, &config, sizeof(config));
return copy_to_user(ifr->ifr_data, &config,
@ -852,6 +827,66 @@ static int stmmac_hwtstamp_get(struct net_device *dev, struct ifreq *ifr)
sizeof(*config)) ? -EFAULT : 0;
}
/**
* stmmac_init_tstamp_counter - init hardware timestamping counter
* @priv: driver private structure
* @systime_flags: timestamping flags
* Description:
* Initialize hardware counter for packet timestamping.
* This is valid as long as the interface is open and not suspended.
* Will be rerun after resuming from suspend, case in which the timestamping
* flags updated by stmmac_hwtstamp_set() also need to be restored.
*/
int stmmac_init_tstamp_counter(struct stmmac_priv *priv, u32 systime_flags)
{
bool xmac = priv->plat->has_gmac4 || priv->plat->has_xgmac;
struct timespec64 now;
u32 sec_inc = 0;
u64 temp = 0;
int ret;
if (!(priv->dma_cap.time_stamp || priv->dma_cap.atime_stamp))
return -EOPNOTSUPP;
ret = clk_prepare_enable(priv->plat->clk_ptp_ref);
if (ret < 0) {
netdev_warn(priv->dev,
"failed to enable PTP reference clock: %pe\n",
ERR_PTR(ret));
return ret;
}
stmmac_config_hw_tstamping(priv, priv->ptpaddr, systime_flags);
priv->systime_flags = systime_flags;
/* program Sub Second Increment reg */
stmmac_config_sub_second_increment(priv, priv->ptpaddr,
priv->plat->clk_ptp_rate,
xmac, &sec_inc);
temp = div_u64(1000000000ULL, sec_inc);
/* Store sub second increment for later use */
priv->sub_second_inc = sec_inc;
/* calculate default added value:
* formula is :
* addend = (2^32)/freq_div_ratio;
* where, freq_div_ratio = 1e9ns/sec_inc
*/
temp = (u64)(temp << 32);
priv->default_addend = div_u64(temp, priv->plat->clk_ptp_rate);
stmmac_config_addend(priv, priv->ptpaddr, priv->default_addend);
/* initialize system time */
ktime_get_real_ts64(&now);
/* lower 32 bits of tv_sec are safe until y2106 */
stmmac_init_systime(priv, priv->ptpaddr, (u32)now.tv_sec, now.tv_nsec);
return 0;
}
EXPORT_SYMBOL_GPL(stmmac_init_tstamp_counter);
/**
* stmmac_init_ptp - init PTP
* @priv: driver private structure
@ -862,9 +897,11 @@ static int stmmac_hwtstamp_get(struct net_device *dev, struct ifreq *ifr)
static int stmmac_init_ptp(struct stmmac_priv *priv)
{
bool xmac = priv->plat->has_gmac4 || priv->plat->has_xgmac;
int ret;
if (!(priv->dma_cap.time_stamp || priv->dma_cap.atime_stamp))
return -EOPNOTSUPP;
ret = stmmac_init_tstamp_counter(priv, STMMAC_HWTS_ACTIVE);
if (ret)
return ret;
priv->adv_ts = 0;
/* Check if adv_ts can be enabled for dwmac 4.x / xgmac core */
@ -3272,10 +3309,6 @@ static int stmmac_hw_setup(struct net_device *dev, bool init_ptp)
stmmac_mmc_setup(priv);
if (init_ptp) {
ret = clk_prepare_enable(priv->plat->clk_ptp_ref);
if (ret < 0)
netdev_warn(priv->dev, "failed to enable PTP reference clock: %d\n", ret);
ret = stmmac_init_ptp(priv);
if (ret == -EOPNOTSUPP)
netdev_warn(priv->dev, "PTP not supported by HW\n");
@ -3769,6 +3802,8 @@ int stmmac_release(struct net_device *dev)
struct stmmac_priv *priv = netdev_priv(dev);
u32 chan;
netif_tx_disable(dev);
if (device_may_wakeup(priv->device))
phylink_speed_down(priv->phylink, false);
/* Stop and disconnect the PHY */
@ -5161,12 +5196,13 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
if (likely(!(status & rx_not_ls)) &&
(likely(priv->synopsys_id >= DWMAC_CORE_4_00) ||
unlikely(status != llc_snap))) {
if (buf2_len)
if (buf2_len) {
buf2_len -= ETH_FCS_LEN;
else
len -= ETH_FCS_LEN;
} else if (buf1_len) {
buf1_len -= ETH_FCS_LEN;
len -= ETH_FCS_LEN;
len -= ETH_FCS_LEN;
}
}
if (!skb) {

View File

@ -816,7 +816,7 @@ static int __maybe_unused stmmac_pltfr_noirq_resume(struct device *dev)
if (ret)
return ret;
clk_prepare_enable(priv->plat->clk_ptp_ref);
stmmac_init_tstamp_counter(priv, priv->systime_flags);
}
return 0;

View File

@ -31,6 +31,8 @@
#define AX_MTU 236
/* some arch define END as assembly function ending, just undef it */
#undef END
/* SLIP/KISS protocol characters. */
#define END 0300 /* indicates end of frame */
#define ESC 0333 /* indicates byte stuffing */

View File

@ -661,22 +661,6 @@ void ipa_cmd_pipeline_clear_wait(struct ipa *ipa)
wait_for_completion(&ipa->completion);
}
void ipa_cmd_pipeline_clear(struct ipa *ipa)
{
u32 count = ipa_cmd_pipeline_clear_count();
struct gsi_trans *trans;
trans = ipa_cmd_trans_alloc(ipa, count);
if (trans) {
ipa_cmd_pipeline_clear_add(trans);
gsi_trans_commit_wait(trans);
ipa_cmd_pipeline_clear_wait(ipa);
} else {
dev_err(&ipa->pdev->dev,
"error allocating %u entry tag transaction\n", count);
}
}
static struct ipa_cmd_info *
ipa_cmd_info_alloc(struct ipa_endpoint *endpoint, u32 tre_count)
{

View File

@ -163,12 +163,6 @@ u32 ipa_cmd_pipeline_clear_count(void);
*/
void ipa_cmd_pipeline_clear_wait(struct ipa *ipa);
/**
* ipa_cmd_pipeline_clear() - Clear the hardware pipeline
* @ipa: - IPA pointer
*/
void ipa_cmd_pipeline_clear(struct ipa *ipa);
/**
* ipa_cmd_trans_alloc() - Allocate a transaction for the command TX endpoint
* @ipa: IPA pointer

View File

@ -1636,8 +1636,6 @@ void ipa_endpoint_suspend(struct ipa *ipa)
if (ipa->modem_netdev)
ipa_modem_suspend(ipa->modem_netdev);
ipa_cmd_pipeline_clear(ipa);
ipa_endpoint_suspend_one(ipa->name_map[IPA_ENDPOINT_AP_LAN_RX]);
ipa_endpoint_suspend_one(ipa->name_map[IPA_ENDPOINT_AP_COMMAND_TX]);
}

View File

@ -28,6 +28,7 @@
#include "ipa_reg.h"
#include "ipa_mem.h"
#include "ipa_table.h"
#include "ipa_smp2p.h"
#include "ipa_modem.h"
#include "ipa_uc.h"
#include "ipa_interrupt.h"
@ -801,6 +802,11 @@ static int ipa_remove(struct platform_device *pdev)
struct device *dev = &pdev->dev;
int ret;
/* Prevent the modem from triggering a call to ipa_setup(). This
* also ensures a modem-initiated setup that's underway completes.
*/
ipa_smp2p_irq_disable_setup(ipa);
ret = pm_runtime_get_sync(dev);
if (WARN_ON(ret < 0))
goto out_power_put;

View File

@ -339,9 +339,6 @@ int ipa_modem_stop(struct ipa *ipa)
if (state != IPA_MODEM_STATE_RUNNING)
return -EBUSY;
/* Prevent the modem from triggering a call to ipa_setup() */
ipa_smp2p_disable(ipa);
/* Clean up the netdev and endpoints if it was started */
if (netdev) {
struct ipa_priv *priv = netdev_priv(netdev);
@ -369,6 +366,9 @@ static void ipa_modem_crashed(struct ipa *ipa)
struct device *dev = &ipa->pdev->dev;
int ret;
/* Prevent the modem from triggering a call to ipa_setup() */
ipa_smp2p_irq_disable_setup(ipa);
ret = pm_runtime_get_sync(dev);
if (ret < 0) {
dev_err(dev, "error %d getting power to handle crash\n", ret);

View File

@ -53,7 +53,7 @@
* @setup_ready_irq: IPA interrupt triggered by modem to signal GSI ready
* @power_on: Whether IPA power is on
* @notified: Whether modem has been notified of power state
* @disabled: Whether setup ready interrupt handling is disabled
* @setup_disabled: Whether setup ready interrupt handler is disabled
* @mutex: Mutex protecting ready-interrupt/shutdown interlock
* @panic_notifier: Panic notifier structure
*/
@ -67,7 +67,7 @@ struct ipa_smp2p {
u32 setup_ready_irq;
bool power_on;
bool notified;
bool disabled;
bool setup_disabled;
struct mutex mutex;
struct notifier_block panic_notifier;
};
@ -155,11 +155,9 @@ static irqreturn_t ipa_smp2p_modem_setup_ready_isr(int irq, void *dev_id)
struct device *dev;
int ret;
mutex_lock(&smp2p->mutex);
if (smp2p->disabled)
goto out_mutex_unlock;
smp2p->disabled = true; /* If any others arrive, ignore them */
/* Ignore any (spurious) interrupts received after the first */
if (smp2p->ipa->setup_complete)
return IRQ_HANDLED;
/* Power needs to be active for setup */
dev = &smp2p->ipa->pdev->dev;
@ -176,8 +174,6 @@ static irqreturn_t ipa_smp2p_modem_setup_ready_isr(int irq, void *dev_id)
out_power_put:
pm_runtime_mark_last_busy(dev);
(void)pm_runtime_put_autosuspend(dev);
out_mutex_unlock:
mutex_unlock(&smp2p->mutex);
return IRQ_HANDLED;
}
@ -313,7 +309,7 @@ void ipa_smp2p_exit(struct ipa *ipa)
kfree(smp2p);
}
void ipa_smp2p_disable(struct ipa *ipa)
void ipa_smp2p_irq_disable_setup(struct ipa *ipa)
{
struct ipa_smp2p *smp2p = ipa->smp2p;
@ -322,7 +318,10 @@ void ipa_smp2p_disable(struct ipa *ipa)
mutex_lock(&smp2p->mutex);
smp2p->disabled = true;
if (!smp2p->setup_disabled) {
disable_irq(smp2p->setup_ready_irq);
smp2p->setup_disabled = true;
}
mutex_unlock(&smp2p->mutex);
}

View File

@ -27,13 +27,12 @@ int ipa_smp2p_init(struct ipa *ipa, bool modem_init);
void ipa_smp2p_exit(struct ipa *ipa);
/**
* ipa_smp2p_disable() - Prevent "ipa-setup-ready" interrupt handling
* ipa_smp2p_irq_disable_setup() - Disable the "setup ready" interrupt
* @ipa: IPA pointer
*
* Prevent handling of the "setup ready" interrupt from the modem.
* This is used before initiating shutdown of the driver.
* Disable the "ipa-setup-ready" interrupt from the modem.
*/
void ipa_smp2p_disable(struct ipa *ipa);
void ipa_smp2p_irq_disable_setup(struct ipa *ipa);
/**
* ipa_smp2p_notify_reset() - Reset modem notification state

View File

@ -61,6 +61,13 @@ static int aspeed_mdio_read(struct mii_bus *bus, int addr, int regnum)
iowrite32(ctrl, ctx->base + ASPEED_MDIO_CTRL);
rc = readl_poll_timeout(ctx->base + ASPEED_MDIO_CTRL, ctrl,
!(ctrl & ASPEED_MDIO_CTRL_FIRE),
ASPEED_MDIO_INTERVAL_US,
ASPEED_MDIO_TIMEOUT_US);
if (rc < 0)
return rc;
rc = readl_poll_timeout(ctx->base + ASPEED_MDIO_DATA, data,
data & ASPEED_MDIO_DATA_IDLE,
ASPEED_MDIO_INTERVAL_US,

View File

@ -710,6 +710,7 @@ static void phylink_resolve(struct work_struct *w)
struct phylink_link_state link_state;
struct net_device *ndev = pl->netdev;
bool mac_config = false;
bool retrigger = false;
bool cur_link_state;
mutex_lock(&pl->state_mutex);
@ -723,6 +724,7 @@ static void phylink_resolve(struct work_struct *w)
link_state.link = false;
} else if (pl->mac_link_dropped) {
link_state.link = false;
retrigger = true;
} else {
switch (pl->cur_link_an_mode) {
case MLO_AN_PHY:
@ -739,6 +741,19 @@ static void phylink_resolve(struct work_struct *w)
case MLO_AN_INBAND:
phylink_mac_pcs_get_state(pl, &link_state);
/* The PCS may have a latching link-fail indicator.
* If the link was up, bring the link down and
* re-trigger the resolve. Otherwise, re-read the
* PCS state to get the current status of the link.
*/
if (!link_state.link) {
if (cur_link_state)
retrigger = true;
else
phylink_mac_pcs_get_state(pl,
&link_state);
}
/* If we have a phy, the "up" state is the union of
* both the PHY and the MAC
*/
@ -747,6 +762,15 @@ static void phylink_resolve(struct work_struct *w)
/* Only update if the PHY link is up */
if (pl->phydev && pl->phy_state.link) {
/* If the interface has changed, force a
* link down event if the link isn't already
* down, and re-resolve.
*/
if (link_state.interface !=
pl->phy_state.interface) {
retrigger = true;
link_state.link = false;
}
link_state.interface = pl->phy_state.interface;
/* If we have a PHY, we need to update with
@ -789,7 +813,7 @@ static void phylink_resolve(struct work_struct *w)
else
phylink_link_up(pl, link_state);
}
if (!link_state.link && pl->mac_link_dropped) {
if (!link_state.link && retrigger) {
pl->mac_link_dropped = false;
queue_work(system_power_efficient_wq, &pl->resolve);
}

View File

@ -40,6 +40,8 @@
insmod -oslip_maxdev=nnn */
#define SL_MTU 296 /* 296; I am used to 600- FvK */
/* some arch define END as assembly function ending, just undef it */
#undef END
/* SLIP protocol characters. */
#define END 0300 /* indicates end of frame */
#define ESC 0333 /* indicates byte stuffing */

View File

@ -1050,6 +1050,14 @@ static const struct net_device_ops smsc95xx_netdev_ops = {
.ndo_set_features = smsc95xx_set_features,
};
static void smsc95xx_handle_link_change(struct net_device *net)
{
struct usbnet *dev = netdev_priv(net);
phy_print_status(net->phydev);
usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
}
static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
{
struct smsc95xx_priv *pdata;
@ -1154,6 +1162,17 @@ static int smsc95xx_bind(struct usbnet *dev, struct usb_interface *intf)
dev->net->min_mtu = ETH_MIN_MTU;
dev->net->max_mtu = ETH_DATA_LEN;
dev->hard_mtu = dev->net->mtu + dev->net->hard_header_len;
ret = phy_connect_direct(dev->net, pdata->phydev,
&smsc95xx_handle_link_change,
PHY_INTERFACE_MODE_MII);
if (ret) {
netdev_err(dev->net, "can't attach PHY to %s\n", pdata->mdiobus->id);
goto unregister_mdio;
}
phy_attached_info(dev->net->phydev);
return 0;
unregister_mdio:
@ -1171,47 +1190,25 @@ static void smsc95xx_unbind(struct usbnet *dev, struct usb_interface *intf)
{
struct smsc95xx_priv *pdata = dev->driver_priv;
phy_disconnect(dev->net->phydev);
mdiobus_unregister(pdata->mdiobus);
mdiobus_free(pdata->mdiobus);
netif_dbg(dev, ifdown, dev->net, "free pdata\n");
kfree(pdata);
}
static void smsc95xx_handle_link_change(struct net_device *net)
{
struct usbnet *dev = netdev_priv(net);
phy_print_status(net->phydev);
usbnet_defer_kevent(dev, EVENT_LINK_CHANGE);
}
static int smsc95xx_start_phy(struct usbnet *dev)
{
struct smsc95xx_priv *pdata = dev->driver_priv;
struct net_device *net = dev->net;
int ret;
phy_start(dev->net->phydev);
ret = smsc95xx_reset(dev);
if (ret < 0)
return ret;
ret = phy_connect_direct(net, pdata->phydev,
&smsc95xx_handle_link_change,
PHY_INTERFACE_MODE_MII);
if (ret) {
netdev_err(net, "can't attach PHY to %s\n", pdata->mdiobus->id);
return ret;
}
phy_attached_info(net->phydev);
phy_start(net->phydev);
return 0;
}
static int smsc95xx_disconnect_phy(struct usbnet *dev)
static int smsc95xx_stop(struct usbnet *dev)
{
phy_stop(dev->net->phydev);
phy_disconnect(dev->net->phydev);
if (dev->net->phydev)
phy_stop(dev->net->phydev);
return 0;
}
@ -1966,7 +1963,7 @@ static const struct driver_info smsc95xx_info = {
.unbind = smsc95xx_unbind,
.link_reset = smsc95xx_link_reset,
.reset = smsc95xx_start_phy,
.stop = smsc95xx_disconnect_phy,
.stop = smsc95xx_stop,
.rx_fixup = smsc95xx_rx_fixup,
.tx_fixup = smsc95xx_tx_fixup,
.status = smsc95xx_status,

View File

@ -202,7 +202,7 @@ static int __init virtual_ncidev_init(void)
miscdev.minor = MISC_DYNAMIC_MINOR;
miscdev.name = "virtual_nci";
miscdev.fops = &virtual_ncidev_fops;
miscdev.mode = S_IALLUGO;
miscdev.mode = 0600;
return misc_register(&miscdev);
}

View File

@ -37,6 +37,7 @@
#define PTP_MSGTYPE_PDELAY_RESP 0x3
#define PTP_EV_PORT 319
#define PTP_GEN_PORT 320
#define PTP_GEN_BIT 0x08 /* indicates general message, if set in message type */
#define OFF_PTP_SOURCE_UUID 22 /* PTPv1 only */

View File

@ -485,6 +485,7 @@ int fib6_nh_init(struct net *net, struct fib6_nh *fib6_nh,
struct fib6_config *cfg, gfp_t gfp_flags,
struct netlink_ext_ack *extack);
void fib6_nh_release(struct fib6_nh *fib6_nh);
void fib6_nh_release_dsts(struct fib6_nh *fib6_nh);
int call_fib6_entry_notifiers(struct net *net,
enum fib_event_type event_type,

View File

@ -47,6 +47,7 @@ struct ipv6_stub {
struct fib6_config *cfg, gfp_t gfp_flags,
struct netlink_ext_ack *extack);
void (*fib6_nh_release)(struct fib6_nh *fib6_nh);
void (*fib6_nh_release_dsts)(struct fib6_nh *fib6_nh);
void (*fib6_update_sernum)(struct net *net, struct fib6_info *rt);
int (*ip6_del_rt)(struct net *net, struct fib6_info *rt, bool skip_notify);
void (*fib6_rt_update)(struct net *net, struct fib6_info *rt,

View File

@ -19,6 +19,8 @@
*
*/
#include <linux/types.h>
#define NL802154_GENL_NAME "nl802154"
enum nl802154_commands {
@ -150,10 +152,9 @@ enum nl802154_attrs {
};
enum nl802154_iftype {
/* for backwards compatibility TODO */
NL802154_IFTYPE_UNSPEC = -1,
NL802154_IFTYPE_UNSPEC = (~(__u32)0),
NL802154_IFTYPE_NODE,
NL802154_IFTYPE_NODE = 0,
NL802154_IFTYPE_MONITOR,
NL802154_IFTYPE_COORD,

View File

@ -703,6 +703,8 @@ int ocelot_vcap_filter_add(struct ocelot *ocelot,
struct netlink_ext_ack *extack);
int ocelot_vcap_filter_del(struct ocelot *ocelot,
struct ocelot_vcap_filter *rule);
int ocelot_vcap_filter_replace(struct ocelot *ocelot,
struct ocelot_vcap_filter *filter);
struct ocelot_vcap_filter *
ocelot_vcap_block_find_filter_by_id(struct ocelot_vcap_block *block,
unsigned long cookie, bool tc_offload);

View File

@ -184,9 +184,6 @@ int register_vlan_dev(struct net_device *dev, struct netlink_ext_ack *extack)
if (err)
goto out_unregister_netdev;
/* Account for reference in struct vlan_dev_priv */
dev_hold(real_dev);
vlan_stacked_transfer_operstate(real_dev, dev, vlan);
linkwatch_fire_event(dev); /* _MUST_ call rfc2863_policy() */

View File

@ -615,6 +615,9 @@ static int vlan_dev_init(struct net_device *dev)
if (!vlan->vlan_pcpu_stats)
return -ENOMEM;
/* Get vlan's reference to real_dev */
dev_hold(real_dev);
return 0;
}

View File

@ -1779,6 +1779,7 @@ int neigh_table_clear(int index, struct neigh_table *tbl)
{
neigh_tables[index] = NULL;
/* It is not clean... Fix it to unload IPv6 module safely */
cancel_delayed_work_sync(&tbl->managed_work);
cancel_delayed_work_sync(&tbl->gc_work);
del_timer_sync(&tbl->proxy_timer);
pneigh_queue_purge(&tbl->proxy_queue);

View File

@ -1719,7 +1719,7 @@ static noinline_for_stack int ethtool_set_coalesce(struct net_device *dev,
struct ethtool_coalesce coalesce;
int ret;
if (!dev->ethtool_ops->set_coalesce && !dev->ethtool_ops->get_coalesce)
if (!dev->ethtool_ops->set_coalesce || !dev->ethtool_ops->get_coalesce)
return -EOPNOTSUPP;
ret = dev->ethtool_ops->get_coalesce(dev, &coalesce, &kernel_coalesce,

View File

@ -1899,15 +1899,36 @@ static void remove_nexthop(struct net *net, struct nexthop *nh,
/* if any FIB entries reference this nexthop, any dst entries
* need to be regenerated
*/
static void nh_rt_cache_flush(struct net *net, struct nexthop *nh)
static void nh_rt_cache_flush(struct net *net, struct nexthop *nh,
struct nexthop *replaced_nh)
{
struct fib6_info *f6i;
struct nh_group *nhg;
int i;
if (!list_empty(&nh->fi_list))
rt_cache_flush(net);
list_for_each_entry(f6i, &nh->f6i_list, nh_list)
ipv6_stub->fib6_update_sernum(net, f6i);
/* if an IPv6 group was replaced, we have to release all old
* dsts to make sure all refcounts are released
*/
if (!replaced_nh->is_group)
return;
/* new dsts must use only the new nexthop group */
synchronize_net();
nhg = rtnl_dereference(replaced_nh->nh_grp);
for (i = 0; i < nhg->num_nh; i++) {
struct nh_grp_entry *nhge = &nhg->nh_entries[i];
struct nh_info *nhi = rtnl_dereference(nhge->nh->nh_info);
if (nhi->family == AF_INET6)
ipv6_stub->fib6_nh_release_dsts(&nhi->fib6_nh);
}
}
static int replace_nexthop_grp(struct net *net, struct nexthop *old,
@ -2247,7 +2268,7 @@ static int replace_nexthop(struct net *net, struct nexthop *old,
err = replace_nexthop_single(net, old, new, extack);
if (!err) {
nh_rt_cache_flush(net, old);
nh_rt_cache_flush(net, old, new);
__remove_nexthop(net, new, NULL);
nexthop_put(new);
@ -2544,11 +2565,15 @@ static int nh_create_ipv6(struct net *net, struct nexthop *nh,
/* sets nh_dev if successful */
err = ipv6_stub->fib6_nh_init(net, fib6_nh, &fib6_cfg, GFP_KERNEL,
extack);
if (err)
if (err) {
/* IPv6 is not enabled, don't call fib6_nh_release */
if (err == -EAFNOSUPPORT)
goto out;
ipv6_stub->fib6_nh_release(fib6_nh);
else
} else {
nh->nh_flags = fib6_nh->fib_nh_flags;
}
out:
return err;
}

View File

@ -330,8 +330,6 @@ static void cubictcp_cong_avoid(struct sock *sk, u32 ack, u32 acked)
return;
if (tcp_in_slow_start(tp)) {
if (hystart && after(ack, ca->end_seq))
bictcp_hystart_reset(sk);
acked = tcp_slow_start(tp, acked);
if (!acked)
return;
@ -391,6 +389,9 @@ static void hystart_update(struct sock *sk, u32 delay)
struct bictcp *ca = inet_csk_ca(sk);
u32 threshold;
if (after(tp->snd_una, ca->end_seq))
bictcp_hystart_reset(sk);
if (hystart_detect & HYSTART_ACK_TRAIN) {
u32 now = bictcp_clock_us(sk);

View File

@ -1026,6 +1026,7 @@ static const struct ipv6_stub ipv6_stub_impl = {
.ip6_mtu_from_fib6 = ip6_mtu_from_fib6,
.fib6_nh_init = fib6_nh_init,
.fib6_nh_release = fib6_nh_release,
.fib6_nh_release_dsts = fib6_nh_release_dsts,
.fib6_update_sernum = fib6_update_sernum_stub,
.fib6_rt_update = fib6_rt_update,
.ip6_del_rt = ip6_del_rt,

View File

@ -174,7 +174,7 @@ static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff
#if defined(CONFIG_NETFILTER) && defined(CONFIG_XFRM)
/* Policy lookup after SNAT yielded a new policy */
if (skb_dst(skb)->xfrm) {
IPCB(skb)->flags |= IPSKB_REROUTED;
IP6CB(skb)->flags |= IP6SKB_REROUTED;
return dst_output(net, sk, skb);
}
#endif

View File

@ -3680,6 +3680,25 @@ void fib6_nh_release(struct fib6_nh *fib6_nh)
fib_nh_common_release(&fib6_nh->nh_common);
}
void fib6_nh_release_dsts(struct fib6_nh *fib6_nh)
{
int cpu;
if (!fib6_nh->rt6i_pcpu)
return;
for_each_possible_cpu(cpu) {
struct rt6_info *pcpu_rt, **ppcpu_rt;
ppcpu_rt = per_cpu_ptr(fib6_nh->rt6i_pcpu, cpu);
pcpu_rt = xchg(ppcpu_rt, NULL);
if (pcpu_rt) {
dst_dev_put(&pcpu_rt->dst);
dst_release(&pcpu_rt->dst);
}
}
}
static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg,
gfp_t gfp_flags,
struct netlink_ext_ack *extack)

View File

@ -422,28 +422,6 @@ bool mptcp_syn_options(struct sock *sk, const struct sk_buff *skb,
return false;
}
/* MP_JOIN client subflow must wait for 4th ack before sending any data:
* TCP can't schedule delack timer before the subflow is fully established.
* MPTCP uses the delack timer to do 3rd ack retransmissions
*/
static void schedule_3rdack_retransmission(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
struct tcp_sock *tp = tcp_sk(sk);
unsigned long timeout;
/* reschedule with a timeout above RTT, as we must look only for drop */
if (tp->srtt_us)
timeout = tp->srtt_us << 1;
else
timeout = TCP_TIMEOUT_INIT;
WARN_ON_ONCE(icsk->icsk_ack.pending & ICSK_ACK_TIMER);
icsk->icsk_ack.pending |= ICSK_ACK_SCHED | ICSK_ACK_TIMER;
icsk->icsk_ack.timeout = timeout;
sk_reset_timer(sk, &icsk->icsk_delack_timer, timeout);
}
static void clear_3rdack_retransmission(struct sock *sk)
{
struct inet_connection_sock *icsk = inet_csk(sk);
@ -526,7 +504,15 @@ static bool mptcp_established_options_mp(struct sock *sk, struct sk_buff *skb,
*size = TCPOLEN_MPTCP_MPJ_ACK;
pr_debug("subflow=%p", subflow);
schedule_3rdack_retransmission(sk);
/* we can use the full delegate action helper only from BH context
* If we are in process context - sk is flushing the backlog at
* socket lock release time - just set the appropriate flag, will
* be handled by the release callback
*/
if (sock_owned_by_user(sk))
set_bit(MPTCP_DELEGATE_ACK, &subflow->delegated_status);
else
mptcp_subflow_delegate(subflow, MPTCP_DELEGATE_ACK);
return true;
}
return false;

View File

@ -1596,7 +1596,8 @@ static void __mptcp_subflow_push_pending(struct sock *sk, struct sock *ssk)
if (!xmit_ssk)
goto out;
if (xmit_ssk != ssk) {
mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk));
mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk),
MPTCP_DELEGATE_SEND);
goto out;
}
@ -2943,7 +2944,7 @@ void __mptcp_check_push(struct sock *sk, struct sock *ssk)
if (xmit_ssk == ssk)
__mptcp_subflow_push_pending(sk, ssk);
else if (xmit_ssk)
mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk));
mptcp_subflow_delegate(mptcp_subflow_ctx(xmit_ssk), MPTCP_DELEGATE_SEND);
} else {
set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
}
@ -2993,18 +2994,50 @@ static void mptcp_release_cb(struct sock *sk)
__mptcp_update_rmem(sk);
}
/* MP_JOIN client subflow must wait for 4th ack before sending any data:
* TCP can't schedule delack timer before the subflow is fully established.
* MPTCP uses the delack timer to do 3rd ack retransmissions
*/
static void schedule_3rdack_retransmission(struct sock *ssk)
{
struct inet_connection_sock *icsk = inet_csk(ssk);
struct tcp_sock *tp = tcp_sk(ssk);
unsigned long timeout;
if (mptcp_subflow_ctx(ssk)->fully_established)
return;
/* reschedule with a timeout above RTT, as we must look only for drop */
if (tp->srtt_us)
timeout = usecs_to_jiffies(tp->srtt_us >> (3 - 1));
else
timeout = TCP_TIMEOUT_INIT;
timeout += jiffies;
WARN_ON_ONCE(icsk->icsk_ack.pending & ICSK_ACK_TIMER);
icsk->icsk_ack.pending |= ICSK_ACK_SCHED | ICSK_ACK_TIMER;
icsk->icsk_ack.timeout = timeout;
sk_reset_timer(ssk, &icsk->icsk_delack_timer, timeout);
}
void mptcp_subflow_process_delegated(struct sock *ssk)
{
struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(ssk);
struct sock *sk = subflow->conn;
mptcp_data_lock(sk);
if (!sock_owned_by_user(sk))
__mptcp_subflow_push_pending(sk, ssk);
else
set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
mptcp_data_unlock(sk);
mptcp_subflow_delegated_done(subflow);
if (test_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status)) {
mptcp_data_lock(sk);
if (!sock_owned_by_user(sk))
__mptcp_subflow_push_pending(sk, ssk);
else
set_bit(MPTCP_PUSH_PENDING, &mptcp_sk(sk)->flags);
mptcp_data_unlock(sk);
mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_SEND);
}
if (test_bit(MPTCP_DELEGATE_ACK, &subflow->delegated_status)) {
schedule_3rdack_retransmission(ssk);
mptcp_subflow_delegated_done(subflow, MPTCP_DELEGATE_ACK);
}
}
static int mptcp_hash(struct sock *sk)

View File

@ -387,6 +387,7 @@ struct mptcp_delegated_action {
DECLARE_PER_CPU(struct mptcp_delegated_action, mptcp_delegated_actions);
#define MPTCP_DELEGATE_SEND 0
#define MPTCP_DELEGATE_ACK 1
/* MPTCP subflow context */
struct mptcp_subflow_context {
@ -492,23 +493,23 @@ static inline void mptcp_add_pending_subflow(struct mptcp_sock *msk,
void mptcp_subflow_process_delegated(struct sock *ssk);
static inline void mptcp_subflow_delegate(struct mptcp_subflow_context *subflow)
static inline void mptcp_subflow_delegate(struct mptcp_subflow_context *subflow, int action)
{
struct mptcp_delegated_action *delegated;
bool schedule;
/* the caller held the subflow bh socket lock */
lockdep_assert_in_softirq();
/* The implied barrier pairs with mptcp_subflow_delegated_done(), and
* ensures the below list check sees list updates done prior to status
* bit changes
*/
if (!test_and_set_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status)) {
if (!test_and_set_bit(action, &subflow->delegated_status)) {
/* still on delegated list from previous scheduling */
if (!list_empty(&subflow->delegated_node))
return;
/* the caller held the subflow bh socket lock */
lockdep_assert_in_softirq();
delegated = this_cpu_ptr(&mptcp_delegated_actions);
schedule = list_empty(&delegated->head);
list_add_tail(&subflow->delegated_node, &delegated->head);
@ -533,16 +534,16 @@ mptcp_subflow_delegated_next(struct mptcp_delegated_action *delegated)
static inline bool mptcp_subflow_has_delegated_action(const struct mptcp_subflow_context *subflow)
{
return test_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status);
return !!READ_ONCE(subflow->delegated_status);
}
static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *subflow)
static inline void mptcp_subflow_delegated_done(struct mptcp_subflow_context *subflow, int action)
{
/* pairs with mptcp_subflow_delegate, ensures delegate_node is updated before
* touching the status bit
*/
smp_wmb();
clear_bit(MPTCP_DELEGATE_SEND, &subflow->delegated_status);
clear_bit(action, &subflow->delegated_status);
}
int mptcp_is_enabled(const struct net *net);

View File

@ -18,6 +18,8 @@
#include "internal.h"
#include "ncsi-pkt.h"
static const int padding_bytes = 26;
u32 ncsi_calculate_checksum(unsigned char *data, int len)
{
u32 checksum = 0;
@ -213,12 +215,17 @@ static int ncsi_cmd_handler_oem(struct sk_buff *skb,
{
struct ncsi_cmd_oem_pkt *cmd;
unsigned int len;
int payload;
/* NC-SI spec DSP_0222_1.2.0, section 8.2.2.2
* requires payload to be padded with 0 to
* 32-bit boundary before the checksum field.
* Ensure the padding bytes are accounted for in
* skb allocation
*/
payload = ALIGN(nca->payload, 4);
len = sizeof(struct ncsi_cmd_pkt_hdr) + 4;
if (nca->payload < 26)
len += 26;
else
len += nca->payload;
len += max(payload, padding_bytes);
cmd = skb_put_zero(skb, len);
memcpy(&cmd->mfr_id, nca->data, nca->payload);
@ -272,6 +279,7 @@ static struct ncsi_request *ncsi_alloc_command(struct ncsi_cmd_arg *nca)
struct net_device *dev = nd->dev;
int hlen = LL_RESERVED_SPACE(dev);
int tlen = dev->needed_tailroom;
int payload;
int len = hlen + tlen;
struct sk_buff *skb;
struct ncsi_request *nr;
@ -281,14 +289,14 @@ static struct ncsi_request *ncsi_alloc_command(struct ncsi_cmd_arg *nca)
return NULL;
/* NCSI command packet has 16-bytes header, payload, 4 bytes checksum.
* Payload needs padding so that the checksum field following payload is
* aligned to 32-bit boundary.
* The packet needs padding if its payload is less than 26 bytes to
* meet 64 bytes minimal ethernet frame length.
*/
len += sizeof(struct ncsi_cmd_pkt_hdr) + 4;
if (nca->payload < 26)
len += 26;
else
len += nca->payload;
payload = ALIGN(nca->payload, 4);
len += max(payload, padding_bytes);
/* Allocate skb */
skb = alloc_skb(len, GFP_ATOMIC);

View File

@ -1919,7 +1919,6 @@ ip_vs_in_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *state
struct ip_vs_proto_data *pd;
struct ip_vs_conn *cp;
int ret, pkts;
int conn_reuse_mode;
struct sock *sk;
int af = state->pf;
@ -1997,15 +1996,16 @@ ip_vs_in_hook(void *priv, struct sk_buff *skb, const struct nf_hook_state *state
cp = INDIRECT_CALL_1(pp->conn_in_get, ip_vs_conn_in_get_proto,
ipvs, af, skb, &iph);
conn_reuse_mode = sysctl_conn_reuse_mode(ipvs);
if (conn_reuse_mode && !iph.fragoffs && is_new_conn(skb, &iph) && cp) {
if (!iph.fragoffs && is_new_conn(skb, &iph) && cp) {
int conn_reuse_mode = sysctl_conn_reuse_mode(ipvs);
bool old_ct = false, resched = false;
if (unlikely(sysctl_expire_nodest_conn(ipvs)) && cp->dest &&
unlikely(!atomic_read(&cp->dest->weight))) {
resched = true;
old_ct = ip_vs_conn_uses_old_conntrack(cp, skb);
} else if (is_new_conn_expected(cp, conn_reuse_mode)) {
} else if (conn_reuse_mode &&
is_new_conn_expected(cp, conn_reuse_mode)) {
old_ct = ip_vs_conn_uses_old_conntrack(cp, skb);
if (!atomic_read(&cp->n_control)) {
resched = true;

View File

@ -1011,11 +1011,9 @@ ctnetlink_alloc_filter(const struct nlattr * const cda[], u8 family)
CTA_TUPLE_REPLY,
filter->family,
&filter->zone,
filter->orig_flags);
if (err < 0) {
err = -EINVAL;
filter->reply_flags);
if (err < 0)
goto err_filter;
}
}
return filter;

View File

@ -65,11 +65,11 @@ static void nf_flow_rule_lwt_match(struct nf_flow_match *match,
sizeof(struct in6_addr));
if (memcmp(&key->enc_ipv6.src, &in6addr_any,
sizeof(struct in6_addr)))
memset(&key->enc_ipv6.src, 0xff,
memset(&mask->enc_ipv6.src, 0xff,
sizeof(struct in6_addr));
if (memcmp(&key->enc_ipv6.dst, &in6addr_any,
sizeof(struct in6_addr)))
memset(&key->enc_ipv6.dst, 0xff,
memset(&mask->enc_ipv6.dst, 0xff,
sizeof(struct in6_addr));
enc_keys |= BIT(FLOW_DISSECTOR_KEY_ENC_IPV6_ADDRS);
key->enc_control.addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;

View File

@ -22,7 +22,6 @@
#include <linux/icmpv6.h>
#include <linux/ip.h>
#include <linux/ipv6.h>
#include <linux/ip.h>
#include <net/sctp/checksum.h>
static bool nft_payload_rebuild_vlan_hdr(const struct sk_buff *skb, int mac_off,

View File

@ -85,9 +85,9 @@ static ssize_t idletimer_tg_show(struct device *dev,
mutex_unlock(&list_mutex);
if (time_after(expires, jiffies) || ktimespec.tv_sec > 0)
return snprintf(buf, PAGE_SIZE, "%ld\n", time_diff);
return sysfs_emit(buf, "%ld\n", time_diff);
return snprintf(buf, PAGE_SIZE, "0\n");
return sysfs_emit(buf, "0\n");
}
static void idletimer_tg_work(struct work_struct *work)

View File

@ -665,12 +665,14 @@ static int ets_qdisc_change(struct Qdisc *sch, struct nlattr *opt,
q->classes[i].deficit = quanta[i];
}
}
for (i = q->nbands; i < oldbands; i++) {
qdisc_tree_flush_backlog(q->classes[i].qdisc);
if (i >= q->nstrict)
list_del(&q->classes[i].alist);
}
q->nstrict = nstrict;
memcpy(q->prio2band, priomap, sizeof(priomap));
for (i = q->nbands; i < oldbands; i++)
qdisc_tree_flush_backlog(q->classes[i].qdisc);
for (i = 0; i < q->nbands; i++)
q->classes[i].quantum = quanta[i];

View File

@ -585,7 +585,7 @@ static void smc_switch_to_fallback(struct smc_sock *smc, int reason_code)
* to clcsocket->wq during the fallback.
*/
spin_lock_irqsave(&smc_wait->lock, flags);
spin_lock(&clc_wait->lock);
spin_lock_nested(&clc_wait->lock, SINGLE_DEPTH_NESTING);
list_splice_init(&smc_wait->head, &clc_wait->head);
spin_unlock(&clc_wait->lock);
spin_unlock_irqrestore(&smc_wait->lock, flags);
@ -2134,8 +2134,10 @@ static int smc_listen(struct socket *sock, int backlog)
smc->clcsock->sk->sk_user_data =
(void *)((uintptr_t)smc | SK_USER_DATA_NOCOPY);
rc = kernel_listen(smc->clcsock, backlog);
if (rc)
if (rc) {
smc->clcsock->sk->sk_data_ready = smc->clcsk_data_ready;
goto out;
}
sk->sk_max_ack_backlog = backlog;
sk->sk_ack_backlog = 0;
sk->sk_state = SMC_LISTEN;
@ -2368,8 +2370,10 @@ static __poll_t smc_poll(struct file *file, struct socket *sock,
static int smc_shutdown(struct socket *sock, int how)
{
struct sock *sk = sock->sk;
bool do_shutdown = true;
struct smc_sock *smc;
int rc = -EINVAL;
int old_state;
int rc1 = 0;
smc = smc_sk(sk);
@ -2396,7 +2400,11 @@ static int smc_shutdown(struct socket *sock, int how)
}
switch (how) {
case SHUT_RDWR: /* shutdown in both directions */
old_state = sk->sk_state;
rc = smc_close_active(smc);
if (old_state == SMC_ACTIVE &&
sk->sk_state == SMC_PEERCLOSEWAIT1)
do_shutdown = false;
break;
case SHUT_WR:
rc = smc_close_shutdown_write(smc);
@ -2406,7 +2414,7 @@ static int smc_shutdown(struct socket *sock, int how)
/* nothing more to do because peer is not involved */
break;
}
if (smc->clcsock)
if (do_shutdown && smc->clcsock)
rc1 = kernel_sock_shutdown(smc->clcsock, how);
/* map sock_shutdown_cmd constants to sk_shutdown value range */
sk->sk_shutdown |= how + 1;

View File

@ -228,6 +228,12 @@ int smc_close_active(struct smc_sock *smc)
/* send close request */
rc = smc_close_final(conn);
sk->sk_state = SMC_PEERCLOSEWAIT1;
/* actively shutdown clcsock before peer close it,
* prevent peer from entering TIME_WAIT state.
*/
if (smc->clcsock && smc->clcsock->sk)
rc = kernel_sock_shutdown(smc->clcsock, SHUT_RDWR);
} else {
/* peer event has changed the state */
goto again;
@ -354,9 +360,9 @@ static void smc_close_passive_work(struct work_struct *work)
if (rxflags->peer_conn_abort) {
/* peer has not received all data */
smc_close_passive_abort_received(smc);
release_sock(&smc->sk);
release_sock(sk);
cancel_delayed_work_sync(&conn->tx_work);
lock_sock(&smc->sk);
lock_sock(sk);
goto wakeup;
}

View File

@ -1672,14 +1672,26 @@ static void smc_link_down_work(struct work_struct *work)
mutex_unlock(&lgr->llc_conf_mutex);
}
/* Determine vlan of internal TCP socket.
* @vlan_id: address to store the determined vlan id into
*/
static int smc_vlan_by_tcpsk_walk(struct net_device *lower_dev,
struct netdev_nested_priv *priv)
{
unsigned short *vlan_id = (unsigned short *)priv->data;
if (is_vlan_dev(lower_dev)) {
*vlan_id = vlan_dev_vlan_id(lower_dev);
return 1;
}
return 0;
}
/* Determine vlan of internal TCP socket. */
int smc_vlan_by_tcpsk(struct socket *clcsock, struct smc_init_info *ini)
{
struct dst_entry *dst = sk_dst_get(clcsock->sk);
struct netdev_nested_priv priv;
struct net_device *ndev;
int i, nest_lvl, rc = 0;
int rc = 0;
ini->vlan_id = 0;
if (!dst) {
@ -1697,20 +1709,9 @@ int smc_vlan_by_tcpsk(struct socket *clcsock, struct smc_init_info *ini)
goto out_rel;
}
priv.data = (void *)&ini->vlan_id;
rtnl_lock();
nest_lvl = ndev->lower_level;
for (i = 0; i < nest_lvl; i++) {
struct list_head *lower = &ndev->adj_list.lower;
if (list_empty(lower))
break;
lower = lower->next;
ndev = (struct net_device *)netdev_lower_get_next(ndev, &lower);
if (is_vlan_dev(ndev)) {
ini->vlan_id = vlan_dev_vlan_id(ndev);
break;
}
}
netdev_walk_all_lower_dev(ndev, smc_vlan_by_tcpsk_walk, &priv);
rtnl_unlock();
out_rel:

View File

@ -61,7 +61,7 @@ static DEFINE_MUTEX(tcpv6_prot_mutex);
static const struct proto *saved_tcpv4_prot;
static DEFINE_MUTEX(tcpv4_prot_mutex);
static struct proto tls_prots[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
static struct proto_ops tls_sw_proto_ops;
static struct proto_ops tls_proto_ops[TLS_NUM_PROTS][TLS_NUM_CONFIG][TLS_NUM_CONFIG];
static void build_protos(struct proto prot[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
const struct proto *base);
@ -71,6 +71,8 @@ void update_sk_prot(struct sock *sk, struct tls_context *ctx)
WRITE_ONCE(sk->sk_prot,
&tls_prots[ip_ver][ctx->tx_conf][ctx->rx_conf]);
WRITE_ONCE(sk->sk_socket->ops,
&tls_proto_ops[ip_ver][ctx->tx_conf][ctx->rx_conf]);
}
int wait_on_pending_writer(struct sock *sk, long *timeo)
@ -669,8 +671,6 @@ static int do_tls_setsockopt_conf(struct sock *sk, sockptr_t optval,
if (tx) {
ctx->sk_write_space = sk->sk_write_space;
sk->sk_write_space = tls_write_space;
} else {
sk->sk_socket->ops = &tls_sw_proto_ops;
}
goto out;
@ -728,6 +728,39 @@ struct tls_context *tls_ctx_create(struct sock *sk)
return ctx;
}
static void build_proto_ops(struct proto_ops ops[TLS_NUM_CONFIG][TLS_NUM_CONFIG],
const struct proto_ops *base)
{
ops[TLS_BASE][TLS_BASE] = *base;
ops[TLS_SW ][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
ops[TLS_SW ][TLS_BASE].sendpage_locked = tls_sw_sendpage_locked;
ops[TLS_BASE][TLS_SW ] = ops[TLS_BASE][TLS_BASE];
ops[TLS_BASE][TLS_SW ].splice_read = tls_sw_splice_read;
ops[TLS_SW ][TLS_SW ] = ops[TLS_SW ][TLS_BASE];
ops[TLS_SW ][TLS_SW ].splice_read = tls_sw_splice_read;
#ifdef CONFIG_TLS_DEVICE
ops[TLS_HW ][TLS_BASE] = ops[TLS_BASE][TLS_BASE];
ops[TLS_HW ][TLS_BASE].sendpage_locked = NULL;
ops[TLS_HW ][TLS_SW ] = ops[TLS_BASE][TLS_SW ];
ops[TLS_HW ][TLS_SW ].sendpage_locked = NULL;
ops[TLS_BASE][TLS_HW ] = ops[TLS_BASE][TLS_SW ];
ops[TLS_SW ][TLS_HW ] = ops[TLS_SW ][TLS_SW ];
ops[TLS_HW ][TLS_HW ] = ops[TLS_HW ][TLS_SW ];
ops[TLS_HW ][TLS_HW ].sendpage_locked = NULL;
#endif
#ifdef CONFIG_TLS_TOE
ops[TLS_HW_RECORD][TLS_HW_RECORD] = *base;
#endif
}
static void tls_build_proto(struct sock *sk)
{
int ip_ver = sk->sk_family == AF_INET6 ? TLSV6 : TLSV4;
@ -739,6 +772,8 @@ static void tls_build_proto(struct sock *sk)
mutex_lock(&tcpv6_prot_mutex);
if (likely(prot != saved_tcpv6_prot)) {
build_protos(tls_prots[TLSV6], prot);
build_proto_ops(tls_proto_ops[TLSV6],
sk->sk_socket->ops);
smp_store_release(&saved_tcpv6_prot, prot);
}
mutex_unlock(&tcpv6_prot_mutex);
@ -749,6 +784,8 @@ static void tls_build_proto(struct sock *sk)
mutex_lock(&tcpv4_prot_mutex);
if (likely(prot != saved_tcpv4_prot)) {
build_protos(tls_prots[TLSV4], prot);
build_proto_ops(tls_proto_ops[TLSV4],
sk->sk_socket->ops);
smp_store_release(&saved_tcpv4_prot, prot);
}
mutex_unlock(&tcpv4_prot_mutex);
@ -959,10 +996,6 @@ static int __init tls_register(void)
if (err)
return err;
tls_sw_proto_ops = inet_stream_ops;
tls_sw_proto_ops.splice_read = tls_sw_splice_read;
tls_sw_proto_ops.sendpage_locked = tls_sw_sendpage_locked;
tls_device_init();
tcp_register_ulp(&tcp_tls_ulp_ops);

View File

@ -2005,6 +2005,7 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
struct sock *sk = sock->sk;
struct sk_buff *skb;
ssize_t copied = 0;
bool from_queue;
int err = 0;
long timeo;
int chunk;
@ -2014,25 +2015,28 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
timeo = sock_rcvtimeo(sk, flags & SPLICE_F_NONBLOCK);
skb = tls_wait_data(sk, NULL, flags & SPLICE_F_NONBLOCK, timeo, &err);
if (!skb)
goto splice_read_end;
if (!ctx->decrypted) {
err = decrypt_skb_update(sk, skb, NULL, &chunk, &zc, false);
/* splice does not support reading control messages */
if (ctx->control != TLS_RECORD_TYPE_DATA) {
err = -EINVAL;
from_queue = !skb_queue_empty(&ctx->rx_list);
if (from_queue) {
skb = __skb_dequeue(&ctx->rx_list);
} else {
skb = tls_wait_data(sk, NULL, flags & SPLICE_F_NONBLOCK, timeo,
&err);
if (!skb)
goto splice_read_end;
}
err = decrypt_skb_update(sk, skb, NULL, &chunk, &zc, false);
if (err < 0) {
tls_err_abort(sk, -EBADMSG);
goto splice_read_end;
}
ctx->decrypted = 1;
}
/* splice does not support reading control messages */
if (ctx->control != TLS_RECORD_TYPE_DATA) {
err = -EINVAL;
goto splice_read_end;
}
rxm = strp_msg(skb);
chunk = min_t(unsigned int, rxm->full_len, len);
@ -2040,7 +2044,17 @@ ssize_t tls_sw_splice_read(struct socket *sock, loff_t *ppos,
if (copied < 0)
goto splice_read_end;
tls_sw_advance_skb(sk, skb, copied);
if (!from_queue) {
ctx->recv_pkt = NULL;
__strp_unpause(&ctx->strp);
}
if (chunk < rxm->full_len) {
__skb_queue_head(&ctx->rx_list, skb);
rxm->offset += len;
rxm->full_len -= len;
} else {
consume_skb(skb);
}
splice_read_end:
release_sock(sk);

View File

@ -2882,9 +2882,6 @@ static int unix_shutdown(struct socket *sock, int mode)
unix_state_lock(sk);
sk->sk_shutdown |= mode;
if ((sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET) &&
mode == SHUTDOWN_MASK)
sk->sk_state = TCP_CLOSE;
other = unix_peer(sk);
if (other)
sock_hold(other);

View File

@ -731,6 +731,7 @@ static unsigned int features[] = {
static struct virtio_driver virtio_vsock_driver = {
.feature_table = features,
.feature_table_size = ARRAY_SIZE(features),
.suppress_used_validation = true,
.driver.name = KBUILD_MODNAME,
.driver.owner = THIS_MODULE,
.id_table = id_table,

View File

@ -7,24 +7,23 @@
/* This struct should be in sync with struct rtnl_link_stats64 */
struct rtnl_link_stats {
__u32 rx_packets; /* total packets received */
__u32 tx_packets; /* total packets transmitted */
__u32 rx_bytes; /* total bytes received */
__u32 tx_bytes; /* total bytes transmitted */
__u32 rx_errors; /* bad packets received */
__u32 tx_errors; /* packet transmit problems */
__u32 rx_dropped; /* no space in linux buffers */
__u32 tx_dropped; /* no space available in linux */
__u32 multicast; /* multicast packets received */
__u32 rx_packets;
__u32 tx_packets;
__u32 rx_bytes;
__u32 tx_bytes;
__u32 rx_errors;
__u32 tx_errors;
__u32 rx_dropped;
__u32 tx_dropped;
__u32 multicast;
__u32 collisions;
/* detailed rx_errors: */
__u32 rx_length_errors;
__u32 rx_over_errors; /* receiver ring buff overflow */
__u32 rx_crc_errors; /* recved pkt with crc error */
__u32 rx_frame_errors; /* recv'd frame alignment error */
__u32 rx_fifo_errors; /* recv'r fifo overrun */
__u32 rx_missed_errors; /* receiver missed packet */
__u32 rx_over_errors;
__u32 rx_crc_errors;
__u32 rx_frame_errors;
__u32 rx_fifo_errors;
__u32 rx_missed_errors;
/* detailed tx_errors */
__u32 tx_aborted_errors;
@ -37,29 +36,201 @@ struct rtnl_link_stats {
__u32 rx_compressed;
__u32 tx_compressed;
__u32 rx_nohandler; /* dropped, no handler found */
__u32 rx_nohandler;
};
/* The main device statistics structure */
/**
* struct rtnl_link_stats64 - The main device statistics structure.
*
* @rx_packets: Number of good packets received by the interface.
* For hardware interfaces counts all good packets received from the device
* by the host, including packets which host had to drop at various stages
* of processing (even in the driver).
*
* @tx_packets: Number of packets successfully transmitted.
* For hardware interfaces counts packets which host was able to successfully
* hand over to the device, which does not necessarily mean that packets
* had been successfully transmitted out of the device, only that device
* acknowledged it copied them out of host memory.
*
* @rx_bytes: Number of good received bytes, corresponding to @rx_packets.
*
* For IEEE 802.3 devices should count the length of Ethernet Frames
* excluding the FCS.
*
* @tx_bytes: Number of good transmitted bytes, corresponding to @tx_packets.
*
* For IEEE 802.3 devices should count the length of Ethernet Frames
* excluding the FCS.
*
* @rx_errors: Total number of bad packets received on this network device.
* This counter must include events counted by @rx_length_errors,
* @rx_crc_errors, @rx_frame_errors and other errors not otherwise
* counted.
*
* @tx_errors: Total number of transmit problems.
* This counter must include events counter by @tx_aborted_errors,
* @tx_carrier_errors, @tx_fifo_errors, @tx_heartbeat_errors,
* @tx_window_errors and other errors not otherwise counted.
*
* @rx_dropped: Number of packets received but not processed,
* e.g. due to lack of resources or unsupported protocol.
* For hardware interfaces this counter may include packets discarded
* due to L2 address filtering but should not include packets dropped
* by the device due to buffer exhaustion which are counted separately in
* @rx_missed_errors (since procfs folds those two counters together).
*
* @tx_dropped: Number of packets dropped on their way to transmission,
* e.g. due to lack of resources.
*
* @multicast: Multicast packets received.
* For hardware interfaces this statistic is commonly calculated
* at the device level (unlike @rx_packets) and therefore may include
* packets which did not reach the host.
*
* For IEEE 802.3 devices this counter may be equivalent to:
*
* - 30.3.1.1.21 aMulticastFramesReceivedOK
*
* @collisions: Number of collisions during packet transmissions.
*
* @rx_length_errors: Number of packets dropped due to invalid length.
* Part of aggregate "frame" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter should be equivalent to a sum
* of the following attributes:
*
* - 30.3.1.1.23 aInRangeLengthErrors
* - 30.3.1.1.24 aOutOfRangeLengthField
* - 30.3.1.1.25 aFrameTooLongErrors
*
* @rx_over_errors: Receiver FIFO overflow event counter.
*
* Historically the count of overflow events. Such events may be
* reported in the receive descriptors or via interrupts, and may
* not correspond one-to-one with dropped packets.
*
* The recommended interpretation for high speed interfaces is -
* number of packets dropped because they did not fit into buffers
* provided by the host, e.g. packets larger than MTU or next buffer
* in the ring was not available for a scatter transfer.
*
* Part of aggregate "frame" errors in `/proc/net/dev`.
*
* This statistics was historically used interchangeably with
* @rx_fifo_errors.
*
* This statistic corresponds to hardware events and is not commonly used
* on software devices.
*
* @rx_crc_errors: Number of packets received with a CRC error.
* Part of aggregate "frame" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter must be equivalent to:
*
* - 30.3.1.1.6 aFrameCheckSequenceErrors
*
* @rx_frame_errors: Receiver frame alignment errors.
* Part of aggregate "frame" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter should be equivalent to:
*
* - 30.3.1.1.7 aAlignmentErrors
*
* @rx_fifo_errors: Receiver FIFO error counter.
*
* Historically the count of overflow events. Those events may be
* reported in the receive descriptors or via interrupts, and may
* not correspond one-to-one with dropped packets.
*
* This statistics was used interchangeably with @rx_over_errors.
* Not recommended for use in drivers for high speed interfaces.
*
* This statistic is used on software devices, e.g. to count software
* packet queue overflow (can) or sequencing errors (GRE).
*
* @rx_missed_errors: Count of packets missed by the host.
* Folded into the "drop" counter in `/proc/net/dev`.
*
* Counts number of packets dropped by the device due to lack
* of buffer space. This usually indicates that the host interface
* is slower than the network interface, or host is not keeping up
* with the receive packet rate.
*
* This statistic corresponds to hardware events and is not used
* on software devices.
*
* @tx_aborted_errors:
* Part of aggregate "carrier" errors in `/proc/net/dev`.
* For IEEE 802.3 devices capable of half-duplex operation this counter
* must be equivalent to:
*
* - 30.3.1.1.11 aFramesAbortedDueToXSColls
*
* High speed interfaces may use this counter as a general device
* discard counter.
*
* @tx_carrier_errors: Number of frame transmission errors due to loss
* of carrier during transmission.
* Part of aggregate "carrier" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter must be equivalent to:
*
* - 30.3.1.1.13 aCarrierSenseErrors
*
* @tx_fifo_errors: Number of frame transmission errors due to device
* FIFO underrun / underflow. This condition occurs when the device
* begins transmission of a frame but is unable to deliver the
* entire frame to the transmitter in time for transmission.
* Part of aggregate "carrier" errors in `/proc/net/dev`.
*
* @tx_heartbeat_errors: Number of Heartbeat / SQE Test errors for
* old half-duplex Ethernet.
* Part of aggregate "carrier" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices possibly equivalent to:
*
* - 30.3.2.1.4 aSQETestErrors
*
* @tx_window_errors: Number of frame transmission errors due
* to late collisions (for Ethernet - after the first 64B of transmission).
* Part of aggregate "carrier" errors in `/proc/net/dev`.
*
* For IEEE 802.3 devices this counter must be equivalent to:
*
* - 30.3.1.1.10 aLateCollisions
*
* @rx_compressed: Number of correctly received compressed packets.
* This counters is only meaningful for interfaces which support
* packet compression (e.g. CSLIP, PPP).
*
* @tx_compressed: Number of transmitted compressed packets.
* This counters is only meaningful for interfaces which support
* packet compression (e.g. CSLIP, PPP).
*
* @rx_nohandler: Number of packets received on the interface
* but dropped by the networking stack because the device is
* not designated to receive packets (e.g. backup link in a bond).
*/
struct rtnl_link_stats64 {
__u64 rx_packets; /* total packets received */
__u64 tx_packets; /* total packets transmitted */
__u64 rx_bytes; /* total bytes received */
__u64 tx_bytes; /* total bytes transmitted */
__u64 rx_errors; /* bad packets received */
__u64 tx_errors; /* packet transmit problems */
__u64 rx_dropped; /* no space in linux buffers */
__u64 tx_dropped; /* no space available in linux */
__u64 multicast; /* multicast packets received */
__u64 rx_packets;
__u64 tx_packets;
__u64 rx_bytes;
__u64 tx_bytes;
__u64 rx_errors;
__u64 tx_errors;
__u64 rx_dropped;
__u64 tx_dropped;
__u64 multicast;
__u64 collisions;
/* detailed rx_errors: */
__u64 rx_length_errors;
__u64 rx_over_errors; /* receiver ring buff overflow */
__u64 rx_crc_errors; /* recved pkt with crc error */
__u64 rx_frame_errors; /* recv'd frame alignment error */
__u64 rx_fifo_errors; /* recv'r fifo overrun */
__u64 rx_missed_errors; /* receiver missed packet */
__u64 rx_over_errors;
__u64 rx_crc_errors;
__u64 rx_frame_errors;
__u64 rx_fifo_errors;
__u64 rx_missed_errors;
/* detailed tx_errors */
__u64 tx_aborted_errors;
@ -71,8 +242,7 @@ struct rtnl_link_stats64 {
/* for cslip etc */
__u64 rx_compressed;
__u64 tx_compressed;
__u64 rx_nohandler; /* dropped, no handler found */
__u64 rx_nohandler;
};
/* The struct should be in sync with struct ifmap */
@ -170,12 +340,29 @@ enum {
IFLA_PROP_LIST,
IFLA_ALT_IFNAME, /* Alternative ifname */
IFLA_PERM_ADDRESS,
IFLA_PROTO_DOWN_REASON,
/* device (sysfs) name as parent, used instead
* of IFLA_LINK where there's no parent netdev
*/
IFLA_PARENT_DEV_NAME,
IFLA_PARENT_DEV_BUS_NAME,
__IFLA_MAX
};
#define IFLA_MAX (__IFLA_MAX - 1)
enum {
IFLA_PROTO_DOWN_REASON_UNSPEC,
IFLA_PROTO_DOWN_REASON_MASK, /* u32, mask for reason bits */
IFLA_PROTO_DOWN_REASON_VALUE, /* u32, reason bit value */
__IFLA_PROTO_DOWN_REASON_CNT,
IFLA_PROTO_DOWN_REASON_MAX = __IFLA_PROTO_DOWN_REASON_CNT - 1
};
/* backwards compatibility for userspace */
#ifndef __KERNEL__
#define IFLA_RTA(r) ((struct rtattr*)(((char*)(r)) + NLMSG_ALIGN(sizeof(struct ifinfomsg))))
@ -293,6 +480,7 @@ enum {
IFLA_BR_MCAST_MLD_VERSION,
IFLA_BR_VLAN_STATS_PER_PORT,
IFLA_BR_MULTI_BOOLOPT,
IFLA_BR_MCAST_QUERIER_STATE,
__IFLA_BR_MAX,
};
@ -346,6 +534,8 @@ enum {
IFLA_BRPORT_BACKUP_PORT,
IFLA_BRPORT_MRP_RING_OPEN,
IFLA_BRPORT_MRP_IN_OPEN,
IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT,
IFLA_BRPORT_MCAST_EHT_HOSTS_CNT,
__IFLA_BRPORT_MAX
};
#define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1)
@ -433,6 +623,7 @@ enum macvlan_macaddr_mode {
};
#define MACVLAN_FLAG_NOPROMISC 1
#define MACVLAN_FLAG_NODST 2 /* skip dst macvlan if matching src macvlan */
/* VRF section */
enum {
@ -597,6 +788,18 @@ enum ifla_geneve_df {
GENEVE_DF_MAX = __GENEVE_DF_END - 1,
};
/* Bareudp section */
enum {
IFLA_BAREUDP_UNSPEC,
IFLA_BAREUDP_PORT,
IFLA_BAREUDP_ETHERTYPE,
IFLA_BAREUDP_SRCPORT_MIN,
IFLA_BAREUDP_MULTIPROTO_MODE,
__IFLA_BAREUDP_MAX
};
#define IFLA_BAREUDP_MAX (__IFLA_BAREUDP_MAX - 1)
/* PPP section */
enum {
IFLA_PPP_UNSPEC,
@ -899,7 +1102,14 @@ enum {
#define IFLA_IPOIB_MAX (__IFLA_IPOIB_MAX - 1)
/* HSR section */
/* HSR/PRP section, both uses same interface */
/* Different redundancy protocols for hsr device */
enum {
HSR_PROTOCOL_HSR,
HSR_PROTOCOL_PRP,
HSR_PROTOCOL_MAX,
};
enum {
IFLA_HSR_UNSPEC,
@ -909,6 +1119,9 @@ enum {
IFLA_HSR_SUPERVISION_ADDR, /* Supervision frame multicast addr */
IFLA_HSR_SEQ_NR,
IFLA_HSR_VERSION, /* HSR version */
IFLA_HSR_PROTOCOL, /* Indicate different protocol than
* HSR. For example PRP.
*/
__IFLA_HSR_MAX,
};
@ -1033,6 +1246,8 @@ enum {
#define RMNET_FLAGS_INGRESS_MAP_COMMANDS (1U << 1)
#define RMNET_FLAGS_INGRESS_MAP_CKSUMV4 (1U << 2)
#define RMNET_FLAGS_EGRESS_MAP_CKSUMV4 (1U << 3)
#define RMNET_FLAGS_INGRESS_MAP_CKSUMV5 (1U << 4)
#define RMNET_FLAGS_EGRESS_MAP_CKSUMV5 (1U << 5)
enum {
IFLA_RMNET_UNSPEC,
@ -1048,4 +1263,14 @@ struct ifla_rmnet_flags {
__u32 mask;
};
/* MCTP section */
enum {
IFLA_MCTP_UNSPEC,
IFLA_MCTP_NET,
__IFLA_MCTP_MAX,
};
#define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
#endif /* _UAPI_LINUX_IF_LINK_H */

View File

@ -34,6 +34,7 @@ TEST_PROGS += srv6_end_dt46_l3vpn_test.sh
TEST_PROGS += srv6_end_dt4_l3vpn_test.sh
TEST_PROGS += srv6_end_dt6_l3vpn_test.sh
TEST_PROGS += vrf_strict_mode_test.sh
TEST_PROGS += arp_ndisc_evict_nocarrier.sh
TEST_PROGS_EXTENDED := in_netns.sh setup_loopback.sh setup_veth.sh
TEST_PROGS_EXTENDED += toeplitz_client.sh toeplitz.sh
TEST_GEN_FILES = socket nettest

View File

@ -629,6 +629,66 @@ ipv6_fcnal()
log_test $? 0 "Nexthops removed on admin down"
}
ipv6_grp_refs()
{
if [ ! -x "$(command -v mausezahn)" ]; then
echo "SKIP: Could not run test; need mausezahn tool"
return
fi
run_cmd "$IP link set dev veth1 up"
run_cmd "$IP link add veth1.10 link veth1 up type vlan id 10"
run_cmd "$IP link add veth1.20 link veth1 up type vlan id 20"
run_cmd "$IP -6 addr add 2001:db8:91::1/64 dev veth1.10"
run_cmd "$IP -6 addr add 2001:db8:92::1/64 dev veth1.20"
run_cmd "$IP -6 neigh add 2001:db8:91::2 lladdr 00:11:22:33:44:55 dev veth1.10"
run_cmd "$IP -6 neigh add 2001:db8:92::2 lladdr 00:11:22:33:44:55 dev veth1.20"
run_cmd "$IP nexthop add id 100 via 2001:db8:91::2 dev veth1.10"
run_cmd "$IP nexthop add id 101 via 2001:db8:92::2 dev veth1.20"
run_cmd "$IP nexthop add id 102 group 100"
run_cmd "$IP route add 2001:db8:101::1/128 nhid 102"
# create per-cpu dsts through nh 100
run_cmd "ip netns exec me mausezahn -6 veth1.10 -B 2001:db8:101::1 -A 2001:db8:91::1 -c 5 -t tcp "dp=1-1023, flags=syn" >/dev/null 2>&1"
# remove nh 100 from the group to delete the route potentially leaving
# a stale per-cpu dst which holds a reference to the nexthop's net
# device and to the IPv6 route
run_cmd "$IP nexthop replace id 102 group 101"
run_cmd "$IP route del 2001:db8:101::1/128"
# add both nexthops to the group so a reference is taken on them
run_cmd "$IP nexthop replace id 102 group 100/101"
# if the bug described in commit "net: nexthop: release IPv6 per-cpu
# dsts when replacing a nexthop group" exists at this point we have
# an unlinked IPv6 route (but not freed due to stale dst) with a
# reference over the group so we delete the group which will again
# only unlink it due to the route reference
run_cmd "$IP nexthop del id 102"
# delete the nexthop with stale dst, since we have an unlinked
# group with a ref to it and an unlinked IPv6 route with ref to the
# group, the nh will only be unlinked and not freed so the stale dst
# remains forever and we get a net device refcount imbalance
run_cmd "$IP nexthop del id 100"
# if a reference was lost this command will hang because the net device
# cannot be removed
timeout -s KILL 5 ip netns exec me ip link del veth1.10 >/dev/null 2>&1
# we can't cleanup if the command is hung trying to delete the netdev
if [ $? -eq 137 ]; then
return 1
fi
# cleanup
run_cmd "$IP link del veth1.20"
run_cmd "$IP nexthop flush"
return 0
}
ipv6_grp_fcnal()
{
local rc
@ -734,6 +794,9 @@ ipv6_grp_fcnal()
run_cmd "$IP nexthop add id 108 group 31/24"
log_test $? 2 "Nexthop group can not have a blackhole and another nexthop"
ipv6_grp_refs
log_test $? 0 "Nexthop group replace refcounts"
}
ipv6_res_grp_fcnal()

View File

@ -78,26 +78,21 @@ static void memrnd(void *s, size_t n)
*byte++ = rand();
}
FIXTURE(tls_basic)
{
int fd, cfd;
bool notls;
};
FIXTURE_SETUP(tls_basic)
static void ulp_sock_pair(struct __test_metadata *_metadata,
int *fd, int *cfd, bool *notls)
{
struct sockaddr_in addr;
socklen_t len;
int sfd, ret;
self->notls = false;
*notls = false;
len = sizeof(addr);
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = htonl(INADDR_ANY);
addr.sin_port = 0;
self->fd = socket(AF_INET, SOCK_STREAM, 0);
*fd = socket(AF_INET, SOCK_STREAM, 0);
sfd = socket(AF_INET, SOCK_STREAM, 0);
ret = bind(sfd, &addr, sizeof(addr));
@ -108,26 +103,96 @@ FIXTURE_SETUP(tls_basic)
ret = getsockname(sfd, &addr, &len);
ASSERT_EQ(ret, 0);
ret = connect(self->fd, &addr, sizeof(addr));
ret = connect(*fd, &addr, sizeof(addr));
ASSERT_EQ(ret, 0);
self->cfd = accept(sfd, &addr, &len);
ASSERT_GE(self->cfd, 0);
*cfd = accept(sfd, &addr, &len);
ASSERT_GE(*cfd, 0);
close(sfd);
ret = setsockopt(self->fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
ret = setsockopt(*fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
if (ret != 0) {
ASSERT_EQ(errno, ENOENT);
self->notls = true;
*notls = true;
printf("Failure setting TCP_ULP, testing without tls\n");
return;
}
ret = setsockopt(self->cfd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
ret = setsockopt(*cfd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
ASSERT_EQ(ret, 0);
}
/* Produce a basic cmsg */
static int tls_send_cmsg(int fd, unsigned char record_type,
void *data, size_t len, int flags)
{
char cbuf[CMSG_SPACE(sizeof(char))];
int cmsg_len = sizeof(char);
struct cmsghdr *cmsg;
struct msghdr msg;
struct iovec vec;
vec.iov_base = data;
vec.iov_len = len;
memset(&msg, 0, sizeof(struct msghdr));
msg.msg_iov = &vec;
msg.msg_iovlen = 1;
msg.msg_control = cbuf;
msg.msg_controllen = sizeof(cbuf);
cmsg = CMSG_FIRSTHDR(&msg);
cmsg->cmsg_level = SOL_TLS;
/* test sending non-record types. */
cmsg->cmsg_type = TLS_SET_RECORD_TYPE;
cmsg->cmsg_len = CMSG_LEN(cmsg_len);
*CMSG_DATA(cmsg) = record_type;
msg.msg_controllen = cmsg->cmsg_len;
return sendmsg(fd, &msg, flags);
}
static int tls_recv_cmsg(struct __test_metadata *_metadata,
int fd, unsigned char record_type,
void *data, size_t len, int flags)
{
char cbuf[CMSG_SPACE(sizeof(char))];
struct cmsghdr *cmsg;
unsigned char ctype;
struct msghdr msg;
struct iovec vec;
int n;
vec.iov_base = data;
vec.iov_len = len;
memset(&msg, 0, sizeof(struct msghdr));
msg.msg_iov = &vec;
msg.msg_iovlen = 1;
msg.msg_control = cbuf;
msg.msg_controllen = sizeof(cbuf);
n = recvmsg(fd, &msg, flags);
cmsg = CMSG_FIRSTHDR(&msg);
EXPECT_NE(cmsg, NULL);
EXPECT_EQ(cmsg->cmsg_level, SOL_TLS);
EXPECT_EQ(cmsg->cmsg_type, TLS_GET_RECORD_TYPE);
ctype = *((unsigned char *)CMSG_DATA(cmsg));
EXPECT_EQ(ctype, record_type);
return n;
}
FIXTURE(tls_basic)
{
int fd, cfd;
bool notls;
};
FIXTURE_SETUP(tls_basic)
{
ulp_sock_pair(_metadata, &self->fd, &self->cfd, &self->notls);
}
FIXTURE_TEARDOWN(tls_basic)
{
close(self->fd);
@ -199,60 +264,21 @@ FIXTURE_VARIANT_ADD(tls, 13_sm4_ccm)
FIXTURE_SETUP(tls)
{
struct tls_crypto_info_keys tls12;
struct sockaddr_in addr;
socklen_t len;
int sfd, ret;
self->notls = false;
len = sizeof(addr);
int ret;
tls_crypto_info_init(variant->tls_version, variant->cipher_type,
&tls12);
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = htonl(INADDR_ANY);
addr.sin_port = 0;
ulp_sock_pair(_metadata, &self->fd, &self->cfd, &self->notls);
self->fd = socket(AF_INET, SOCK_STREAM, 0);
sfd = socket(AF_INET, SOCK_STREAM, 0);
if (self->notls)
return;
ret = bind(sfd, &addr, sizeof(addr));
ASSERT_EQ(ret, 0);
ret = listen(sfd, 10);
ret = setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len);
ASSERT_EQ(ret, 0);
ret = getsockname(sfd, &addr, &len);
ret = setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12, tls12.len);
ASSERT_EQ(ret, 0);
ret = connect(self->fd, &addr, sizeof(addr));
ASSERT_EQ(ret, 0);
ret = setsockopt(self->fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
if (ret != 0) {
self->notls = true;
printf("Failure setting TCP_ULP, testing without tls\n");
}
if (!self->notls) {
ret = setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12,
tls12.len);
ASSERT_EQ(ret, 0);
}
self->cfd = accept(sfd, &addr, &len);
ASSERT_GE(self->cfd, 0);
if (!self->notls) {
ret = setsockopt(self->cfd, IPPROTO_TCP, TCP_ULP, "tls",
sizeof("tls"));
ASSERT_EQ(ret, 0);
ret = setsockopt(self->cfd, SOL_TLS, TLS_RX, &tls12,
tls12.len);
ASSERT_EQ(ret, 0);
}
close(sfd);
}
FIXTURE_TEARDOWN(tls)
@ -613,6 +639,95 @@ TEST_F(tls, splice_to_pipe)
EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0);
}
TEST_F(tls, splice_cmsg_to_pipe)
{
char *test_str = "test_read";
char record_type = 100;
int send_len = 10;
char buf[10];
int p[2];
ASSERT_GE(pipe(p), 0);
EXPECT_EQ(tls_send_cmsg(self->fd, 100, test_str, send_len, 0), 10);
EXPECT_EQ(splice(self->cfd, NULL, p[1], NULL, send_len, 0), -1);
EXPECT_EQ(errno, EINVAL);
EXPECT_EQ(recv(self->cfd, buf, send_len, 0), -1);
EXPECT_EQ(errno, EIO);
EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, record_type,
buf, sizeof(buf), MSG_WAITALL),
send_len);
EXPECT_EQ(memcmp(test_str, buf, send_len), 0);
}
TEST_F(tls, splice_dec_cmsg_to_pipe)
{
char *test_str = "test_read";
char record_type = 100;
int send_len = 10;
char buf[10];
int p[2];
ASSERT_GE(pipe(p), 0);
EXPECT_EQ(tls_send_cmsg(self->fd, 100, test_str, send_len, 0), 10);
EXPECT_EQ(recv(self->cfd, buf, send_len, 0), -1);
EXPECT_EQ(errno, EIO);
EXPECT_EQ(splice(self->cfd, NULL, p[1], NULL, send_len, 0), -1);
EXPECT_EQ(errno, EINVAL);
EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, record_type,
buf, sizeof(buf), MSG_WAITALL),
send_len);
EXPECT_EQ(memcmp(test_str, buf, send_len), 0);
}
TEST_F(tls, recv_and_splice)
{
int send_len = TLS_PAYLOAD_MAX_LEN;
char mem_send[TLS_PAYLOAD_MAX_LEN];
char mem_recv[TLS_PAYLOAD_MAX_LEN];
int half = send_len / 2;
int p[2];
ASSERT_GE(pipe(p), 0);
EXPECT_EQ(send(self->fd, mem_send, send_len, 0), send_len);
/* Recv hald of the record, splice the other half */
EXPECT_EQ(recv(self->cfd, mem_recv, half, MSG_WAITALL), half);
EXPECT_EQ(splice(self->cfd, NULL, p[1], NULL, half, SPLICE_F_NONBLOCK),
half);
EXPECT_EQ(read(p[0], &mem_recv[half], half), half);
EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0);
}
TEST_F(tls, peek_and_splice)
{
int send_len = TLS_PAYLOAD_MAX_LEN;
char mem_send[TLS_PAYLOAD_MAX_LEN];
char mem_recv[TLS_PAYLOAD_MAX_LEN];
int chunk = TLS_PAYLOAD_MAX_LEN / 4;
int n, i, p[2];
memrnd(mem_send, sizeof(mem_send));
ASSERT_GE(pipe(p), 0);
for (i = 0; i < 4; i++)
EXPECT_EQ(send(self->fd, &mem_send[chunk * i], chunk, 0),
chunk);
EXPECT_EQ(recv(self->cfd, mem_recv, chunk * 5 / 2,
MSG_WAITALL | MSG_PEEK),
chunk * 5 / 2);
EXPECT_EQ(memcmp(mem_send, mem_recv, chunk * 5 / 2), 0);
n = 0;
while (n < send_len) {
i = splice(self->cfd, NULL, p[1], NULL, send_len - n, 0);
EXPECT_GT(i, 0);
n += i;
}
EXPECT_EQ(n, send_len);
EXPECT_EQ(read(p[0], mem_recv, send_len), send_len);
EXPECT_EQ(memcmp(mem_send, mem_recv, send_len), 0);
}
TEST_F(tls, recvmsg_single)
{
char const *test_str = "test_recvmsg_single";
@ -1193,60 +1308,30 @@ TEST_F(tls, mutliproc_sendpage_writers)
TEST_F(tls, control_msg)
{
if (self->notls)
return;
char cbuf[CMSG_SPACE(sizeof(char))];
char const *test_str = "test_read";
int cmsg_len = sizeof(char);
char *test_str = "test_read";
char record_type = 100;
struct cmsghdr *cmsg;
struct msghdr msg;
int send_len = 10;
struct iovec vec;
char buf[10];
vec.iov_base = (char *)test_str;
vec.iov_len = 10;
memset(&msg, 0, sizeof(struct msghdr));
msg.msg_iov = &vec;
msg.msg_iovlen = 1;
msg.msg_control = cbuf;
msg.msg_controllen = sizeof(cbuf);
cmsg = CMSG_FIRSTHDR(&msg);
cmsg->cmsg_level = SOL_TLS;
/* test sending non-record types. */
cmsg->cmsg_type = TLS_SET_RECORD_TYPE;
cmsg->cmsg_len = CMSG_LEN(cmsg_len);
*CMSG_DATA(cmsg) = record_type;
msg.msg_controllen = cmsg->cmsg_len;
if (self->notls)
SKIP(return, "no TLS support");
EXPECT_EQ(sendmsg(self->fd, &msg, 0), send_len);
EXPECT_EQ(tls_send_cmsg(self->fd, record_type, test_str, send_len, 0),
send_len);
/* Should fail because we didn't provide a control message */
EXPECT_EQ(recv(self->cfd, buf, send_len, 0), -1);
vec.iov_base = buf;
EXPECT_EQ(recvmsg(self->cfd, &msg, MSG_WAITALL | MSG_PEEK), send_len);
cmsg = CMSG_FIRSTHDR(&msg);
EXPECT_NE(cmsg, NULL);
EXPECT_EQ(cmsg->cmsg_level, SOL_TLS);
EXPECT_EQ(cmsg->cmsg_type, TLS_GET_RECORD_TYPE);
record_type = *((unsigned char *)CMSG_DATA(cmsg));
EXPECT_EQ(record_type, 100);
EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, record_type,
buf, sizeof(buf), MSG_WAITALL | MSG_PEEK),
send_len);
EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
/* Recv the message again without MSG_PEEK */
record_type = 0;
memset(buf, 0, sizeof(buf));
EXPECT_EQ(recvmsg(self->cfd, &msg, MSG_WAITALL), send_len);
cmsg = CMSG_FIRSTHDR(&msg);
EXPECT_NE(cmsg, NULL);
EXPECT_EQ(cmsg->cmsg_level, SOL_TLS);
EXPECT_EQ(cmsg->cmsg_type, TLS_GET_RECORD_TYPE);
record_type = *((unsigned char *)CMSG_DATA(cmsg));
EXPECT_EQ(record_type, 100);
EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, record_type,
buf, sizeof(buf), MSG_WAITALL),
send_len);
EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
}
@ -1301,6 +1386,160 @@ TEST_F(tls, shutdown_reuse)
EXPECT_EQ(errno, EISCONN);
}
FIXTURE(tls_err)
{
int fd, cfd;
int fd2, cfd2;
bool notls;
};
FIXTURE_VARIANT(tls_err)
{
uint16_t tls_version;
};
FIXTURE_VARIANT_ADD(tls_err, 12_aes_gcm)
{
.tls_version = TLS_1_2_VERSION,
};
FIXTURE_VARIANT_ADD(tls_err, 13_aes_gcm)
{
.tls_version = TLS_1_3_VERSION,
};
FIXTURE_SETUP(tls_err)
{
struct tls_crypto_info_keys tls12;
int ret;
tls_crypto_info_init(variant->tls_version, TLS_CIPHER_AES_GCM_128,
&tls12);
ulp_sock_pair(_metadata, &self->fd, &self->cfd, &self->notls);
ulp_sock_pair(_metadata, &self->fd2, &self->cfd2, &self->notls);
if (self->notls)
return;
ret = setsockopt(self->fd, SOL_TLS, TLS_TX, &tls12, tls12.len);
ASSERT_EQ(ret, 0);
ret = setsockopt(self->cfd2, SOL_TLS, TLS_RX, &tls12, tls12.len);
ASSERT_EQ(ret, 0);
}
FIXTURE_TEARDOWN(tls_err)
{
close(self->fd);
close(self->cfd);
close(self->fd2);
close(self->cfd2);
}
TEST_F(tls_err, bad_rec)
{
char buf[64];
if (self->notls)
SKIP(return, "no TLS support");
memset(buf, 0x55, sizeof(buf));
EXPECT_EQ(send(self->fd2, buf, sizeof(buf), 0), sizeof(buf));
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
EXPECT_EQ(errno, EMSGSIZE);
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), MSG_DONTWAIT), -1);
EXPECT_EQ(errno, EAGAIN);
}
TEST_F(tls_err, bad_auth)
{
char buf[128];
int n;
if (self->notls)
SKIP(return, "no TLS support");
memrnd(buf, sizeof(buf) / 2);
EXPECT_EQ(send(self->fd, buf, sizeof(buf) / 2, 0), sizeof(buf) / 2);
n = recv(self->cfd, buf, sizeof(buf), 0);
EXPECT_GT(n, sizeof(buf) / 2);
buf[n - 1]++;
EXPECT_EQ(send(self->fd2, buf, n, 0), n);
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
EXPECT_EQ(errno, EBADMSG);
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
EXPECT_EQ(errno, EBADMSG);
}
TEST_F(tls_err, bad_in_large_read)
{
char txt[3][64];
char cip[3][128];
char buf[3 * 128];
int i, n;
if (self->notls)
SKIP(return, "no TLS support");
/* Put 3 records in the sockets */
for (i = 0; i < 3; i++) {
memrnd(txt[i], sizeof(txt[i]));
EXPECT_EQ(send(self->fd, txt[i], sizeof(txt[i]), 0),
sizeof(txt[i]));
n = recv(self->cfd, cip[i], sizeof(cip[i]), 0);
EXPECT_GT(n, sizeof(txt[i]));
/* Break the third message */
if (i == 2)
cip[2][n - 1]++;
EXPECT_EQ(send(self->fd2, cip[i], n, 0), n);
}
/* We should be able to receive the first two messages */
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), sizeof(txt[0]) * 2);
EXPECT_EQ(memcmp(buf, txt[0], sizeof(txt[0])), 0);
EXPECT_EQ(memcmp(buf + sizeof(txt[0]), txt[1], sizeof(txt[1])), 0);
/* Third mesasge is bad */
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
EXPECT_EQ(errno, EBADMSG);
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
EXPECT_EQ(errno, EBADMSG);
}
TEST_F(tls_err, bad_cmsg)
{
char *test_str = "test_read";
int send_len = 10;
char cip[128];
char buf[128];
char txt[64];
int n;
if (self->notls)
SKIP(return, "no TLS support");
/* Queue up one data record */
memrnd(txt, sizeof(txt));
EXPECT_EQ(send(self->fd, txt, sizeof(txt), 0), sizeof(txt));
n = recv(self->cfd, cip, sizeof(cip), 0);
EXPECT_GT(n, sizeof(txt));
EXPECT_EQ(send(self->fd2, cip, n, 0), n);
EXPECT_EQ(tls_send_cmsg(self->fd, 100, test_str, send_len, 0), 10);
n = recv(self->cfd, cip, sizeof(cip), 0);
cip[n - 1]++; /* Break it */
EXPECT_GT(n, send_len);
EXPECT_EQ(send(self->fd2, cip, n, 0), n);
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), sizeof(txt));
EXPECT_EQ(memcmp(buf, txt, sizeof(txt)), 0);
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
EXPECT_EQ(errno, EBADMSG);
EXPECT_EQ(recv(self->cfd2, buf, sizeof(buf), 0), -1);
EXPECT_EQ(errno, EBADMSG);
}
TEST(non_established) {
struct tls12_crypto_info_aes_gcm_256 tls12;
struct sockaddr_in addr;
@ -1355,64 +1594,82 @@ TEST(non_established) {
TEST(keysizes) {
struct tls12_crypto_info_aes_gcm_256 tls12;
struct sockaddr_in addr;
int sfd, ret, fd, cfd;
socklen_t len;
int ret, fd, cfd;
bool notls;
notls = false;
len = sizeof(addr);
memset(&tls12, 0, sizeof(tls12));
tls12.info.version = TLS_1_2_VERSION;
tls12.info.cipher_type = TLS_CIPHER_AES_GCM_256;
addr.sin_family = AF_INET;
addr.sin_addr.s_addr = htonl(INADDR_ANY);
addr.sin_port = 0;
fd = socket(AF_INET, SOCK_STREAM, 0);
sfd = socket(AF_INET, SOCK_STREAM, 0);
ret = bind(sfd, &addr, sizeof(addr));
ASSERT_EQ(ret, 0);
ret = listen(sfd, 10);
ASSERT_EQ(ret, 0);
ret = getsockname(sfd, &addr, &len);
ASSERT_EQ(ret, 0);
ret = connect(fd, &addr, sizeof(addr));
ASSERT_EQ(ret, 0);
ret = setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
if (ret != 0) {
notls = true;
printf("Failure setting TCP_ULP, testing without tls\n");
}
ulp_sock_pair(_metadata, &fd, &cfd, &notls);
if (!notls) {
ret = setsockopt(fd, SOL_TLS, TLS_TX, &tls12,
sizeof(tls12));
EXPECT_EQ(ret, 0);
}
cfd = accept(sfd, &addr, &len);
ASSERT_GE(cfd, 0);
if (!notls) {
ret = setsockopt(cfd, IPPROTO_TCP, TCP_ULP, "tls",
sizeof("tls"));
EXPECT_EQ(ret, 0);
ret = setsockopt(cfd, SOL_TLS, TLS_RX, &tls12,
sizeof(tls12));
EXPECT_EQ(ret, 0);
}
close(sfd);
close(fd);
close(cfd);
}
TEST(tls_v6ops) {
struct tls_crypto_info_keys tls12;
struct sockaddr_in6 addr, addr2;
int sfd, ret, fd;
socklen_t len, len2;
tls_crypto_info_init(TLS_1_2_VERSION, TLS_CIPHER_AES_GCM_128, &tls12);
addr.sin6_family = AF_INET6;
addr.sin6_addr = in6addr_any;
addr.sin6_port = 0;
fd = socket(AF_INET6, SOCK_STREAM, 0);
sfd = socket(AF_INET6, SOCK_STREAM, 0);
ret = bind(sfd, &addr, sizeof(addr));
ASSERT_EQ(ret, 0);
ret = listen(sfd, 10);
ASSERT_EQ(ret, 0);
len = sizeof(addr);
ret = getsockname(sfd, &addr, &len);
ASSERT_EQ(ret, 0);
ret = connect(fd, &addr, sizeof(addr));
ASSERT_EQ(ret, 0);
len = sizeof(addr);
ret = getsockname(fd, &addr, &len);
ASSERT_EQ(ret, 0);
ret = setsockopt(fd, IPPROTO_TCP, TCP_ULP, "tls", sizeof("tls"));
if (ret) {
ASSERT_EQ(errno, ENOENT);
SKIP(return, "no TLS support");
}
ASSERT_EQ(ret, 0);
ret = setsockopt(fd, SOL_TLS, TLS_TX, &tls12, tls12.len);
ASSERT_EQ(ret, 0);
ret = setsockopt(fd, SOL_TLS, TLS_RX, &tls12, tls12.len);
ASSERT_EQ(ret, 0);
len2 = sizeof(addr2);
ret = getsockname(fd, &addr2, &len2);
ASSERT_EQ(ret, 0);
EXPECT_EQ(len2, len);
EXPECT_EQ(memcmp(&addr, &addr2, len), 0);
close(fd);
close(sfd);
}
TEST_HARNESS_MAIN

View File

@ -5,7 +5,8 @@ TEST_PROGS := nft_trans_stress.sh nft_fib.sh nft_nat.sh bridge_brouter.sh \
conntrack_icmp_related.sh nft_flowtable.sh ipvs.sh \
nft_concat_range.sh nft_conntrack_helper.sh \
nft_queue.sh nft_meta.sh nf_nat_edemux.sh \
ipip-conntrack-mtu.sh conntrack_tcp_unreplied.sh
ipip-conntrack-mtu.sh conntrack_tcp_unreplied.sh \
conntrack_vrf.sh
LDLIBS = -lmnl
TEST_GEN_FILES = nf-queue

View File

@ -0,0 +1,219 @@
#!/bin/sh
# This script demonstrates interaction of conntrack and vrf.
# The vrf driver calls the netfilter hooks again, with oif/iif
# pointing at the VRF device.
#
# For ingress, this means first iteration has iifname of lower/real
# device. In this script, thats veth0.
# Second iteration is iifname set to vrf device, tvrf in this script.
#
# For egress, this is reversed: first iteration has the vrf device,
# second iteration is done with the lower/real/veth0 device.
#
# test_ct_zone_in demonstrates unexpected change of nftables
# behavior # caused by commit 09e856d54bda5f28 "vrf: Reset skb conntrack
# connection on VRF rcv"
#
# It was possible to assign conntrack zone to a packet (or mark it for
# `notracking`) in the prerouting chain before conntrack, based on real iif.
#
# After the change, the zone assignment is lost and the zone is assigned based
# on the VRF master interface (in case such a rule exists).
# assignment is lost. Instead, assignment based on the `iif` matching
# Thus it is impossible to distinguish packets based on the original
# interface.
#
# test_masquerade_vrf and test_masquerade_veth0 demonstrate the problem
# that was supposed to be fixed by the commit mentioned above to make sure
# that any fix to test case 1 won't break masquerade again.
ksft_skip=4
IP0=172.30.30.1
IP1=172.30.30.2
PFXL=30
ret=0
sfx=$(mktemp -u "XXXXXXXX")
ns0="ns0-$sfx"
ns1="ns1-$sfx"
cleanup()
{
ip netns pids $ns0 | xargs kill 2>/dev/null
ip netns pids $ns1 | xargs kill 2>/dev/null
ip netns del $ns0 $ns1
}
nft --version > /dev/null 2>&1
if [ $? -ne 0 ];then
echo "SKIP: Could not run test without nft tool"
exit $ksft_skip
fi
ip -Version > /dev/null 2>&1
if [ $? -ne 0 ];then
echo "SKIP: Could not run test without ip tool"
exit $ksft_skip
fi
ip netns add "$ns0"
if [ $? -ne 0 ];then
echo "SKIP: Could not create net namespace $ns0"
exit $ksft_skip
fi
ip netns add "$ns1"
trap cleanup EXIT
ip netns exec $ns0 sysctl -q -w net.ipv4.conf.default.rp_filter=0
ip netns exec $ns0 sysctl -q -w net.ipv4.conf.all.rp_filter=0
ip netns exec $ns0 sysctl -q -w net.ipv4.conf.all.rp_filter=0
ip link add veth0 netns "$ns0" type veth peer name veth0 netns "$ns1" > /dev/null 2>&1
if [ $? -ne 0 ];then
echo "SKIP: Could not add veth device"
exit $ksft_skip
fi
ip -net $ns0 li add tvrf type vrf table 9876
if [ $? -ne 0 ];then
echo "SKIP: Could not add vrf device"
exit $ksft_skip
fi
ip -net $ns0 li set lo up
ip -net $ns0 li set veth0 master tvrf
ip -net $ns0 li set tvrf up
ip -net $ns0 li set veth0 up
ip -net $ns1 li set veth0 up
ip -net $ns0 addr add $IP0/$PFXL dev veth0
ip -net $ns1 addr add $IP1/$PFXL dev veth0
ip netns exec $ns1 iperf3 -s > /dev/null 2>&1&
if [ $? -ne 0 ];then
echo "SKIP: Could not start iperf3"
exit $ksft_skip
fi
# test vrf ingress handling.
# The incoming connection should be placed in conntrack zone 1,
# as decided by the first iteration of the ruleset.
test_ct_zone_in()
{
ip netns exec $ns0 nft -f - <<EOF
table testct {
chain rawpre {
type filter hook prerouting priority raw;
iif { veth0, tvrf } counter meta nftrace set 1
iif veth0 counter ct zone set 1 counter return
iif tvrf counter ct zone set 2 counter return
ip protocol icmp counter
notrack counter
}
chain rawout {
type filter hook output priority raw;
oif veth0 counter ct zone set 1 counter return
oif tvrf counter ct zone set 2 counter return
notrack counter
}
}
EOF
ip netns exec $ns1 ping -W 1 -c 1 -I veth0 $IP0 > /dev/null
# should be in zone 1, not zone 2
count=$(ip netns exec $ns0 conntrack -L -s $IP1 -d $IP0 -p icmp --zone 1 2>/dev/null | wc -l)
if [ $count -eq 1 ]; then
echo "PASS: entry found in conntrack zone 1"
else
echo "FAIL: entry not found in conntrack zone 1"
count=$(ip netns exec $ns0 conntrack -L -s $IP1 -d $IP0 -p icmp --zone 2 2> /dev/null | wc -l)
if [ $count -eq 1 ]; then
echo "FAIL: entry found in zone 2 instead"
else
echo "FAIL: entry not in zone 1 or 2, dumping table"
ip netns exec $ns0 conntrack -L
ip netns exec $ns0 nft list ruleset
fi
fi
}
# add masq rule that gets evaluated w. outif set to vrf device.
# This tests the first iteration of the packet through conntrack,
# oifname is the vrf device.
test_masquerade_vrf()
{
ip netns exec $ns0 conntrack -F 2>/dev/null
ip netns exec $ns0 nft -f - <<EOF
flush ruleset
table ip nat {
chain postrouting {
type nat hook postrouting priority 0;
# NB: masquerade should always be combined with 'oif(name) bla',
# lack of this is intentional here, we want to exercise double-snat.
ip saddr 172.30.30.0/30 counter masquerade random
}
}
EOF
ip netns exec $ns0 ip vrf exec tvrf iperf3 -t 1 -c $IP1 >/dev/null
if [ $? -ne 0 ]; then
echo "FAIL: iperf3 connect failure with masquerade + sport rewrite on vrf device"
ret=1
return
fi
# must also check that nat table was evaluated on second (lower device) iteration.
ip netns exec $ns0 nft list table ip nat |grep -q 'counter packets 2'
if [ $? -eq 0 ]; then
echo "PASS: iperf3 connect with masquerade + sport rewrite on vrf device"
else
echo "FAIL: vrf masq rule has unexpected counter value"
ret=1
fi
}
# add masq rule that gets evaluated w. outif set to veth device.
# This tests the 2nd iteration of the packet through conntrack,
# oifname is the lower device (veth0 in this case).
test_masquerade_veth()
{
ip netns exec $ns0 conntrack -F 2>/dev/null
ip netns exec $ns0 nft -f - <<EOF
flush ruleset
table ip nat {
chain postrouting {
type nat hook postrouting priority 0;
meta oif veth0 ip saddr 172.30.30.0/30 counter masquerade random
}
}
EOF
ip netns exec $ns0 ip vrf exec tvrf iperf3 -t 1 -c $IP1 > /dev/null
if [ $? -ne 0 ]; then
echo "FAIL: iperf3 connect failure with masquerade + sport rewrite on veth device"
ret=1
return
fi
# must also check that nat table was evaluated on second (lower device) iteration.
ip netns exec $ns0 nft list table ip nat |grep -q 'counter packets 2'
if [ $? -eq 0 ]; then
echo "PASS: iperf3 connect with masquerade + sport rewrite on veth device"
else
echo "FAIL: vrf masq rule has unexpected counter value"
ret=1
fi
}
test_ct_zone_in
test_masquerade_vrf
test_masquerade_veth
exit $ret

View File

@ -759,19 +759,21 @@ test_port_shadow()
local result=""
local logmsg=""
echo ROUTER | ip netns exec "$ns0" nc -w 5 -u -l -p 1405 >/dev/null 2>&1 &
nc_r=$!
echo CLIENT | ip netns exec "$ns2" nc -w 5 -u -l -p 1405 >/dev/null 2>&1 &
nc_c=$!
# make shadow entry, from client (ns2), going to (ns1), port 41404, sport 1405.
echo "fake-entry" | ip netns exec "$ns2" nc -w 1 -p 1405 -u "$daddrc" 41404 > /dev/null
echo "fake-entry" | ip netns exec "$ns2" timeout 1 socat -u STDIN UDP:"$daddrc":41404,sourceport=1405
echo ROUTER | ip netns exec "$ns0" timeout 5 socat -u STDIN UDP4-LISTEN:1405 &
sc_r=$!
echo CLIENT | ip netns exec "$ns2" timeout 5 socat -u STDIN UDP4-LISTEN:1405,reuseport &
sc_c=$!
sleep 0.3
# ns1 tries to connect to ns0:1405. With default settings this should connect
# to client, it matches the conntrack entry created above.
result=$(echo "" | ip netns exec "$ns1" nc -w 1 -p 41404 -u "$daddrs" 1405)
result=$(echo "data" | ip netns exec "$ns1" timeout 1 socat - UDP:"$daddrs":1405,sourceport=41404)
if [ "$result" = "$expect" ] ;then
echo "PASS: portshadow test $test: got reply from ${expect}${logmsg}"
@ -780,7 +782,7 @@ test_port_shadow()
ret=1
fi
kill $nc_r $nc_c 2>/dev/null
kill $sc_r $sc_c 2>/dev/null
# flush udp entries for next test round, if any
ip netns exec "$ns0" conntrack -F >/dev/null 2>&1
@ -816,11 +818,10 @@ table $family raw {
chain prerouting {
type filter hook prerouting priority -300; policy accept;
meta iif veth0 udp dport 1405 notrack
udp dport 1405 notrack
}
chain output {
type filter hook output priority -300; policy accept;
udp sport 1405 notrack
meta oif veth0 udp sport 1405 notrack
}
}
EOF
@ -851,6 +852,18 @@ test_port_shadowing()
{
local family="ip"
conntrack -h >/dev/null 2>&1
if [ $? -ne 0 ];then
echo "SKIP: Could not run nat port shadowing test without conntrack tool"
return
fi
socat -h > /dev/null 2>&1
if [ $? -ne 0 ];then
echo "SKIP: Could not run nat port shadowing test without socat tool"
return
fi
ip netns exec "$ns0" sysctl net.ipv4.conf.veth0.forwarding=1 > /dev/null
ip netns exec "$ns0" sysctl net.ipv4.conf.veth1.forwarding=1 > /dev/null

View File

@ -16,6 +16,10 @@ timeout=4
cleanup()
{
ip netns pids ${ns1} | xargs kill 2>/dev/null
ip netns pids ${ns2} | xargs kill 2>/dev/null
ip netns pids ${nsrouter} | xargs kill 2>/dev/null
ip netns del ${ns1}
ip netns del ${ns2}
ip netns del ${nsrouter}
@ -332,6 +336,55 @@ EOF
echo "PASS: tcp via loopback and re-queueing"
}
test_icmp_vrf() {
ip -net $ns1 link add tvrf type vrf table 9876
if [ $? -ne 0 ];then
echo "SKIP: Could not add vrf device"
return
fi
ip -net $ns1 li set eth0 master tvrf
ip -net $ns1 li set tvrf up
ip -net $ns1 route add 10.0.2.0/24 via 10.0.1.1 dev eth0 table 9876
ip netns exec ${ns1} nft -f /dev/stdin <<EOF
flush ruleset
table inet filter {
chain output {
type filter hook output priority 0; policy accept;
meta oifname "tvrf" icmp type echo-request counter queue num 1
meta oifname "eth0" icmp type echo-request counter queue num 1
}
chain post {
type filter hook postrouting priority 0; policy accept;
meta oifname "tvrf" icmp type echo-request counter queue num 1
meta oifname "eth0" icmp type echo-request counter queue num 1
}
}
EOF
ip netns exec ${ns1} ./nf-queue -q 1 -t $timeout &
local nfqpid=$!
sleep 1
ip netns exec ${ns1} ip vrf exec tvrf ping -c 1 10.0.2.99 > /dev/null
for n in output post; do
for d in tvrf eth0; do
ip netns exec ${ns1} nft list chain inet filter $n | grep -q "oifname \"$d\" icmp type echo-request counter packets 1"
if [ $? -ne 0 ] ; then
echo "FAIL: chain $n: icmp packet counter mismatch for device $d" 1>&2
ip netns exec ${ns1} nft list ruleset
ret=1
return
fi
done
done
wait $nfqpid
[ $? -eq 0 ] && echo "PASS: icmp+nfqueue via vrf"
wait 2>/dev/null
}
ip netns exec ${nsrouter} sysctl net.ipv6.conf.all.forwarding=1 > /dev/null
ip netns exec ${nsrouter} sysctl net.ipv4.conf.veth0.forwarding=1 > /dev/null
ip netns exec ${nsrouter} sysctl net.ipv4.conf.veth1.forwarding=1 > /dev/null
@ -372,5 +425,6 @@ test_queue 20
test_tcp_forward
test_tcp_localhost
test_tcp_localhost_requeue
test_icmp_vrf
exit $ret

View File

@ -68,7 +68,7 @@
"cmdUnderTest": "$TC action add action bpf object-file $EBPFDIR/action.o section action-ok index 667",
"expExitCode": "0",
"verifyCmd": "$TC action get action bpf index 667",
"matchPattern": "action order [0-9]*: bpf action.o:\\[action-ok\\] id [0-9]* tag [0-9a-f]{16}( jited)? default-action pipe.*index 667 ref",
"matchPattern": "action order [0-9]*: bpf action.o:\\[action-ok\\] id [0-9].* tag [0-9a-f]{16}( jited)? default-action pipe.*index 667 ref",
"matchCount": "1",
"teardown": [
"$TC action flush action bpf"

View File

@ -15,7 +15,7 @@
"cmdUnderTest": "$TC qdisc add dev $ETH root handle 1: mq",
"expExitCode": "0",
"verifyCmd": "$TC qdisc show dev $ETH",
"matchPattern": "qdisc pfifo_fast 0: parent 1:[1-4] bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1",
"matchPattern": "qdisc [a-zA-Z0-9_]+ 0: parent 1:[1-4]",
"matchCount": "4",
"teardown": [
"echo \"1\" > /sys/bus/netdevsim/del_device"
@ -37,7 +37,7 @@
"cmdUnderTest": "$TC qdisc add dev $ETH root handle 1: mq",
"expExitCode": "0",
"verifyCmd": "$TC qdisc show dev $ETH",
"matchPattern": "qdisc pfifo_fast 0: parent 1:[1-9,a-f][0-9,a-f]{0,2} bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1",
"matchPattern": "qdisc [a-zA-Z0-9_]+ 0: parent 1:[1-9,a-f][0-9,a-f]{0,2}",
"matchCount": "256",
"teardown": [
"echo \"1\" > /sys/bus/netdevsim/del_device"
@ -60,7 +60,7 @@
"cmdUnderTest": "$TC qdisc add dev $ETH root handle 1: mq",
"expExitCode": "2",
"verifyCmd": "$TC qdisc show dev $ETH",
"matchPattern": "qdisc pfifo_fast 0: parent 1:[1-4] bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1",
"matchPattern": "qdisc [a-zA-Z0-9_]+ 0: parent 1:[1-4]",
"matchCount": "4",
"teardown": [
"echo \"1\" > /sys/bus/netdevsim/del_device"
@ -82,7 +82,7 @@
"cmdUnderTest": "$TC qdisc del dev $ETH root handle 1: mq",
"expExitCode": "2",
"verifyCmd": "$TC qdisc show dev $ETH",
"matchPattern": "qdisc pfifo_fast 0: parent 1:[1-4] bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1",
"matchPattern": "qdisc [a-zA-Z0-9_]+ 0: parent 1:[1-4]",
"matchCount": "0",
"teardown": [
"echo \"1\" > /sys/bus/netdevsim/del_device"
@ -106,7 +106,7 @@
"cmdUnderTest": "$TC qdisc del dev $ETH root handle 1: mq",
"expExitCode": "2",
"verifyCmd": "$TC qdisc show dev $ETH",
"matchPattern": "qdisc pfifo_fast 0: parent 1:[1-4] bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1",
"matchPattern": "qdisc [a-zA-Z0-9_]+ 0: parent 1:[1-4]",
"matchCount": "0",
"teardown": [
"echo \"1\" > /sys/bus/netdevsim/del_device"
@ -128,7 +128,7 @@
"cmdUnderTest": "$TC qdisc add dev $ETH root handle 1: mq",
"expExitCode": "2",
"verifyCmd": "$TC qdisc show dev $ETH",
"matchPattern": "qdisc pfifo_fast 0: parent 1:[1-4] bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1",
"matchPattern": "qdisc [a-zA-Z0-9_]+ 0: parent 1:[1-4]",
"matchCount": "0",
"teardown": [
"echo \"1\" > /sys/bus/netdevsim/del_device"