Including fixes from bpf and netfilter.

Current release - regressions:
 
   - af_unix: fix another unix GC hangup
 
 Previous releases - regressions:
 
   - core: fix a possible AF_UNIX deadlock
 
   - bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready()
 
   - netfilter: nft_flow_offload: release dst in case direct xmit path is used
 
   - bridge: switchdev: ensure MDB events are delivered exactly once
 
   - l2tp: pass correct message length to ip6_append_data
 
   - dccp/tcp: unhash sk from ehash for tb2 alloc failure after check_estalblished()
 
   - tls: fixes for record type handling with PEEK
 
   - devlink: fix possible use-after-free and memory leaks in devlink_init()
 
 Previous releases - always broken:
 
   - bpf: fix an oops when attempting to read the vsyscall
   	 page through bpf_probe_read_kernel
 
   - sched: act_mirred: use the backlog for mirred ingress
 
   - netfilter: nft_flow_offload: fix dst refcount underflow
 
   - ipv6: sr: fix possible use-after-free and null-ptr-deref
 
   - mptcp: fix several data races
 
   - phonet: take correct lock to peek at the RX queue
 
 Misc:
 
   - handful of fixes and reliability improvements for selftests
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmXXKMMSHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkmgAQAIV2NAVEvHVBtnm0Df9PuCcHQx6i9veS
 tGxOZMVwb5ePFI+dpiNyyn61koEiRuFLOm66pfJAuT5j5z6m4PEFfPZgtiVpCHVK
 4sz4UD4+jVLmYijv+YlWkPU3RWR0RejSkDbXwY5Y9Io/DWHhA2iq5IyMy2MncUPY
 dUc12ddEsYRH60Kmm2/96FcdbHw9Y64mDC8tIeIlCAQfng4U98EXJbCq9WXsPPlW
 vjwSKwRG76QGDugss9XkatQ7Bsva1qTobFGDOvBMQpMt+dr81pTGVi0c1h/drzvI
 EJaDO8jJU3Xy0pQ80beboCJ1KlVCYhWSmwlBMZUA1f0lA2m3U5UFEtHA5hHKs3Mi
 jNe/sgKXzThrro0fishAXbzrro2QDhCG3Vm4PRlOGexIyy+n0gIp1lHwEY1p2vX9
 RJPdt1e3xt/5NYRv6l2GVQYFi8Wd0endgzCdJeXk0OWQFLFtnxhG6ejpgxtgN0fp
 CzKU6orFpsddQtcEOdIzKMUA3CXYWAdQPXOE5Ptjoz3MXZsQqtMm3vN4and8jJ19
 8/VLsCNPp11bSRTmNY3Xt85e+gjIA2mRwgRo+ieL6b1x2AqNeVizlr6IZWYQ4TdG
 rUdlEX0IVmov80TSeQoWgtzTO7xMER+qN6FxAs3pQoUFjtol3pEURq9FQ2QZ8jW4
 5rKpNBrjKxdk
 =eUOc
 -----END PGP SIGNATURE-----

Merge tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf and netfilter.

  Current release - regressions:

   - af_unix: fix another unix GC hangup

  Previous releases - regressions:

   - core: fix a possible AF_UNIX deadlock

   - bpf: fix NULL pointer dereference in sk_psock_verdict_data_ready()

   - netfilter: nft_flow_offload: release dst in case direct xmit path
     is used

   - bridge: switchdev: ensure MDB events are delivered exactly once

   - l2tp: pass correct message length to ip6_append_data

   - dccp/tcp: unhash sk from ehash for tb2 alloc failure after
     check_estalblished()

   - tls: fixes for record type handling with PEEK

   - devlink: fix possible use-after-free and memory leaks in
     devlink_init()

  Previous releases - always broken:

   - bpf: fix an oops when attempting to read the vsyscall page through
     bpf_probe_read_kernel

   - sched: act_mirred: use the backlog for mirred ingress

   - netfilter: nft_flow_offload: fix dst refcount underflow

   - ipv6: sr: fix possible use-after-free and null-ptr-deref

   - mptcp: fix several data races

   - phonet: take correct lock to peek at the RX queue

  Misc:

   - handful of fixes and reliability improvements for selftests"

* tag 'net-6.8.0-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (72 commits)
  l2tp: pass correct message length to ip6_append_data
  net: phy: realtek: Fix rtl8211f_config_init() for RTL8211F(D)(I)-VD-CG PHY
  selftests: ioam: refactoring to align with the fix
  Fix write to cloned skb in ipv6_hop_ioam()
  phonet/pep: fix racy skb_queue_empty() use
  phonet: take correct lock to peek at the RX queue
  net: sparx5: Add spinlock for frame transmission from CPU
  net/sched: flower: Add lock protection when remove filter handle
  devlink: fix port dump cmd type
  net: stmmac: Fix EST offset for dwmac 5.10
  tools: ynl: don't leak mcast_groups on init error
  tools: ynl: make sure we always pass yarg to mnl_cb_run
  net: mctp: put sock on tag allocation failure
  netfilter: nf_tables: use kzalloc for hook allocation
  netfilter: nf_tables: register hooks last when adding new chain/flowtable
  netfilter: nft_flow_offload: release dst in case direct xmit path is used
  netfilter: nft_flow_offload: reset dst in route object after setting up flow
  netfilter: nf_tables: set dormant flag on hook register failure
  selftests: tls: add test for peeking past a record of a different type
  selftests: tls: add test for merging of same-type control messages
  ...
This commit is contained in:
Linus Torvalds 2024-02-22 09:57:58 -08:00
commit 6714ebb922
75 changed files with 870 additions and 378 deletions

View File

@ -431,7 +431,7 @@ patchwork checks
Checks in patchwork are mostly simple wrappers around existing kernel Checks in patchwork are mostly simple wrappers around existing kernel
scripts, the sources are available at: scripts, the sources are available at:
https://github.com/kuba-moo/nipa/tree/master/tests https://github.com/linux-netdev/nipa/tree/master/tests
**Do not** post your patches just to run them through the checks. **Do not** post your patches just to run them through the checks.
You must ensure that your patches are ready by testing them locally You must ensure that your patches are ready by testing them locally

View File

@ -15242,6 +15242,8 @@ F: Documentation/networking/
F: Documentation/networking/net_cachelines/ F: Documentation/networking/net_cachelines/
F: Documentation/process/maintainer-netdev.rst F: Documentation/process/maintainer-netdev.rst
F: Documentation/userspace-api/netlink/ F: Documentation/userspace-api/netlink/
F: include/linux/framer/framer-provider.h
F: include/linux/framer/framer.h
F: include/linux/in.h F: include/linux/in.h
F: include/linux/indirect_call_wrapper.h F: include/linux/indirect_call_wrapper.h
F: include/linux/net.h F: include/linux/net.h

View File

@ -4,6 +4,7 @@
#include <linux/seqlock.h> #include <linux/seqlock.h>
#include <uapi/asm/vsyscall.h> #include <uapi/asm/vsyscall.h>
#include <asm/page_types.h>
#ifdef CONFIG_X86_VSYSCALL_EMULATION #ifdef CONFIG_X86_VSYSCALL_EMULATION
extern void map_vsyscall(void); extern void map_vsyscall(void);
@ -24,4 +25,13 @@ static inline bool emulate_vsyscall(unsigned long error_code,
} }
#endif #endif
/*
* The (legacy) vsyscall page is the long page in the kernel portion
* of the address space that has user-accessible permissions.
*/
static inline bool is_vsyscall_vaddr(unsigned long vaddr)
{
return unlikely((vaddr & PAGE_MASK) == VSYSCALL_ADDR);
}
#endif /* _ASM_X86_VSYSCALL_H */ #endif /* _ASM_X86_VSYSCALL_H */

View File

@ -798,15 +798,6 @@ show_signal_msg(struct pt_regs *regs, unsigned long error_code,
show_opcodes(regs, loglvl); show_opcodes(regs, loglvl);
} }
/*
* The (legacy) vsyscall page is the long page in the kernel portion
* of the address space that has user-accessible permissions.
*/
static bool is_vsyscall_vaddr(unsigned long vaddr)
{
return unlikely((vaddr & PAGE_MASK) == VSYSCALL_ADDR);
}
static void static void
__bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code, __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
unsigned long address, u32 pkey, int si_code) unsigned long address, u32 pkey, int si_code)

View File

@ -3,6 +3,8 @@
#include <linux/uaccess.h> #include <linux/uaccess.h>
#include <linux/kernel.h> #include <linux/kernel.h>
#include <asm/vsyscall.h>
#ifdef CONFIG_X86_64 #ifdef CONFIG_X86_64
bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size) bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
{ {
@ -15,6 +17,14 @@ bool copy_from_kernel_nofault_allowed(const void *unsafe_src, size_t size)
if (vaddr < TASK_SIZE_MAX + PAGE_SIZE) if (vaddr < TASK_SIZE_MAX + PAGE_SIZE)
return false; return false;
/*
* Reading from the vsyscall page may cause an unhandled fault in
* certain cases. Though it is at an address above TASK_SIZE_MAX, it is
* usually considered as a user space address.
*/
if (is_vsyscall_vaddr(vaddr))
return false;
/* /*
* Allow everything during early boot before 'x86_virt_bits' * Allow everything during early boot before 'x86_virt_bits'
* is initialized. Needed for instruction decoding in early * is initialized. Needed for instruction decoding in early

View File

@ -7,6 +7,7 @@ config NET_VENDOR_ADI
bool "Analog Devices devices" bool "Analog Devices devices"
default y default y
depends on SPI depends on SPI
select PHYLIB
help help
If you have a network (Ethernet) card belonging to this class, say Y. If you have a network (Ethernet) card belonging to this class, say Y.

View File

@ -535,9 +535,6 @@ int bcmasp_netfilt_get_all_active(struct bcmasp_intf *intf, u32 *rule_locs,
int j = 0, i; int j = 0, i;
for (i = 0; i < NUM_NET_FILTERS; i++) { for (i = 0; i < NUM_NET_FILTERS; i++) {
if (j == *rule_cnt)
return -EMSGSIZE;
if (!priv->net_filters[i].claimed || if (!priv->net_filters[i].claimed ||
priv->net_filters[i].port != intf->port) priv->net_filters[i].port != intf->port)
continue; continue;
@ -547,6 +544,9 @@ int bcmasp_netfilt_get_all_active(struct bcmasp_intf *intf, u32 *rule_locs,
priv->net_filters[i - 1].wake_filter) priv->net_filters[i - 1].wake_filter)
continue; continue;
if (j == *rule_cnt)
return -EMSGSIZE;
rule_locs[j++] = priv->net_filters[i].fs.location; rule_locs[j++] = priv->net_filters[i].fs.location;
} }

View File

@ -1050,6 +1050,9 @@ static int bcmasp_netif_init(struct net_device *dev, bool phy_connect)
netdev_err(dev, "could not attach to PHY\n"); netdev_err(dev, "could not attach to PHY\n");
goto err_phy_disable; goto err_phy_disable;
} }
/* Indicate that the MAC is responsible for PHY PM */
phydev->mac_managed_pm = true;
} else if (!intf->wolopts) { } else if (!intf->wolopts) {
ret = phy_resume(dev->phydev); ret = phy_resume(dev->phydev);
if (ret) if (ret)

View File

@ -49,7 +49,8 @@ int vic_provinfo_add_tlv(struct vic_provinfo *vp, u16 type, u16 length,
tlv->type = htons(type); tlv->type = htons(type);
tlv->length = htons(length); tlv->length = htons(length);
memcpy(tlv->value, value, length); unsafe_memcpy(tlv->value, value, length,
/* Flexible array of flexible arrays */);
vp->num_tlvs = htonl(ntohl(vp->num_tlvs) + 1); vp->num_tlvs = htonl(ntohl(vp->num_tlvs) + 1);
vp->length = htonl(ntohl(vp->length) + vp->length = htonl(ntohl(vp->length) +

View File

@ -415,6 +415,10 @@ static void npc_fixup_vf_rule(struct rvu *rvu, struct npc_mcam *mcam,
return; return;
} }
/* AF modifies given action iff PF/VF has requested for it */
if ((entry->action & 0xFULL) != NIX_RX_ACTION_DEFAULT)
return;
/* copy VF default entry action to the VF mcam entry */ /* copy VF default entry action to the VF mcam entry */
rx_action = npc_get_default_entry_action(rvu, mcam, blkaddr, rx_action = npc_get_default_entry_action(rvu, mcam, blkaddr,
target_func); target_func);

View File

@ -757,6 +757,7 @@ static int mchp_sparx5_probe(struct platform_device *pdev)
platform_set_drvdata(pdev, sparx5); platform_set_drvdata(pdev, sparx5);
sparx5->pdev = pdev; sparx5->pdev = pdev;
sparx5->dev = &pdev->dev; sparx5->dev = &pdev->dev;
spin_lock_init(&sparx5->tx_lock);
/* Do switch core reset if available */ /* Do switch core reset if available */
reset = devm_reset_control_get_optional_shared(&pdev->dev, "switch"); reset = devm_reset_control_get_optional_shared(&pdev->dev, "switch");

View File

@ -280,6 +280,7 @@ struct sparx5 {
int xtr_irq; int xtr_irq;
/* Frame DMA */ /* Frame DMA */
int fdma_irq; int fdma_irq;
spinlock_t tx_lock; /* lock for frame transmission */
struct sparx5_rx rx; struct sparx5_rx rx;
struct sparx5_tx tx; struct sparx5_tx tx;
/* PTP */ /* PTP */

View File

@ -244,10 +244,12 @@ netdev_tx_t sparx5_port_xmit_impl(struct sk_buff *skb, struct net_device *dev)
} }
skb_tx_timestamp(skb); skb_tx_timestamp(skb);
spin_lock(&sparx5->tx_lock);
if (sparx5->fdma_irq > 0) if (sparx5->fdma_irq > 0)
ret = sparx5_fdma_xmit(sparx5, ifh, skb); ret = sparx5_fdma_xmit(sparx5, ifh, skb);
else else
ret = sparx5_inject(sparx5, ifh, skb, dev); ret = sparx5_inject(sparx5, ifh, skb, dev);
spin_unlock(&sparx5->tx_lock);
if (ret == -EBUSY) if (ret == -EBUSY)
goto busy; goto busy;

View File

@ -223,7 +223,7 @@ static void ionic_clear_pci(struct ionic *ionic)
ionic_unmap_bars(ionic); ionic_unmap_bars(ionic);
pci_release_regions(ionic->pdev); pci_release_regions(ionic->pdev);
if (atomic_read(&ionic->pdev->enable_cnt) > 0) if (pci_is_enabled(ionic->pdev))
pci_disable_device(ionic->pdev); pci_disable_device(ionic->pdev);
} }

View File

@ -224,7 +224,7 @@ static const struct stmmac_hwif_entry {
.regs = { .regs = {
.ptp_off = PTP_GMAC4_OFFSET, .ptp_off = PTP_GMAC4_OFFSET,
.mmc_off = MMC_GMAC4_OFFSET, .mmc_off = MMC_GMAC4_OFFSET,
.est_off = EST_XGMAC_OFFSET, .est_off = EST_GMAC4_OFFSET,
}, },
.desc = &dwmac4_desc_ops, .desc = &dwmac4_desc_ops,
.dma = &dwmac410_dma_ops, .dma = &dwmac410_dma_ops,

View File

@ -6059,11 +6059,6 @@ static irqreturn_t stmmac_mac_interrupt(int irq, void *dev_id)
struct net_device *dev = (struct net_device *)dev_id; struct net_device *dev = (struct net_device *)dev_id;
struct stmmac_priv *priv = netdev_priv(dev); struct stmmac_priv *priv = netdev_priv(dev);
if (unlikely(!dev)) {
netdev_err(priv->dev, "%s: invalid dev pointer\n", __func__);
return IRQ_NONE;
}
/* Check if adapter is up */ /* Check if adapter is up */
if (test_bit(STMMAC_DOWN, &priv->state)) if (test_bit(STMMAC_DOWN, &priv->state))
return IRQ_HANDLED; return IRQ_HANDLED;
@ -6079,11 +6074,6 @@ static irqreturn_t stmmac_safety_interrupt(int irq, void *dev_id)
struct net_device *dev = (struct net_device *)dev_id; struct net_device *dev = (struct net_device *)dev_id;
struct stmmac_priv *priv = netdev_priv(dev); struct stmmac_priv *priv = netdev_priv(dev);
if (unlikely(!dev)) {
netdev_err(priv->dev, "%s: invalid dev pointer\n", __func__);
return IRQ_NONE;
}
/* Check if adapter is up */ /* Check if adapter is up */
if (test_bit(STMMAC_DOWN, &priv->state)) if (test_bit(STMMAC_DOWN, &priv->state))
return IRQ_HANDLED; return IRQ_HANDLED;
@ -6105,11 +6095,6 @@ static irqreturn_t stmmac_msi_intr_tx(int irq, void *data)
dma_conf = container_of(tx_q, struct stmmac_dma_conf, tx_queue[chan]); dma_conf = container_of(tx_q, struct stmmac_dma_conf, tx_queue[chan]);
priv = container_of(dma_conf, struct stmmac_priv, dma_conf); priv = container_of(dma_conf, struct stmmac_priv, dma_conf);
if (unlikely(!data)) {
netdev_err(priv->dev, "%s: invalid dev pointer\n", __func__);
return IRQ_NONE;
}
/* Check if adapter is up */ /* Check if adapter is up */
if (test_bit(STMMAC_DOWN, &priv->state)) if (test_bit(STMMAC_DOWN, &priv->state))
return IRQ_HANDLED; return IRQ_HANDLED;
@ -6136,11 +6121,6 @@ static irqreturn_t stmmac_msi_intr_rx(int irq, void *data)
dma_conf = container_of(rx_q, struct stmmac_dma_conf, rx_queue[chan]); dma_conf = container_of(rx_q, struct stmmac_dma_conf, rx_queue[chan]);
priv = container_of(dma_conf, struct stmmac_priv, dma_conf); priv = container_of(dma_conf, struct stmmac_priv, dma_conf);
if (unlikely(!data)) {
netdev_err(priv->dev, "%s: invalid dev pointer\n", __func__);
return IRQ_NONE;
}
/* Check if adapter is up */ /* Check if adapter is up */
if (test_bit(STMMAC_DOWN, &priv->state)) if (test_bit(STMMAC_DOWN, &priv->state))
return IRQ_HANDLED; return IRQ_HANDLED;

View File

@ -1907,20 +1907,20 @@ static int __init gtp_init(void)
if (err < 0) if (err < 0)
goto error_out; goto error_out;
err = genl_register_family(&gtp_genl_family); err = register_pernet_subsys(&gtp_net_ops);
if (err < 0) if (err < 0)
goto unreg_rtnl_link; goto unreg_rtnl_link;
err = register_pernet_subsys(&gtp_net_ops); err = genl_register_family(&gtp_genl_family);
if (err < 0) if (err < 0)
goto unreg_genl_family; goto unreg_pernet_subsys;
pr_info("GTP module loaded (pdp ctx size %zd bytes)\n", pr_info("GTP module loaded (pdp ctx size %zd bytes)\n",
sizeof(struct pdp_ctx)); sizeof(struct pdp_ctx));
return 0; return 0;
unreg_genl_family: unreg_pernet_subsys:
genl_unregister_family(&gtp_genl_family); unregister_pernet_subsys(&gtp_net_ops);
unreg_rtnl_link: unreg_rtnl_link:
rtnl_link_unregister(&gtp_link_ops); rtnl_link_unregister(&gtp_link_ops);
error_out: error_out:

View File

@ -212,7 +212,7 @@ void ipa_interrupt_suspend_clear_all(struct ipa_interrupt *interrupt)
u32 unit_count; u32 unit_count;
u32 unit; u32 unit;
unit_count = roundup(ipa->endpoint_count, 32); unit_count = DIV_ROUND_UP(ipa->endpoint_count, 32);
for (unit = 0; unit < unit_count; unit++) { for (unit = 0; unit < unit_count; unit++) {
const struct reg *reg; const struct reg *reg;
u32 val; u32 val;

View File

@ -421,9 +421,11 @@ static int rtl8211f_config_init(struct phy_device *phydev)
ERR_PTR(ret)); ERR_PTR(ret));
return ret; return ret;
} }
return genphy_soft_reset(phydev);
} }
return genphy_soft_reset(phydev); return 0;
} }
static int rtl821x_suspend(struct phy_device *phydev) static int rtl821x_suspend(struct phy_device *phydev)

View File

@ -276,7 +276,7 @@ nf_flow_table_offload_del_cb(struct nf_flowtable *flow_table,
} }
void flow_offload_route_init(struct flow_offload *flow, void flow_offload_route_init(struct flow_offload *flow,
const struct nf_flow_route *route); struct nf_flow_route *route);
int flow_offload_add(struct nf_flowtable *flow_table, struct flow_offload *flow); int flow_offload_add(struct nf_flowtable *flow_table, struct flow_offload *flow);
void flow_offload_refresh(struct nf_flowtable *flow_table, void flow_offload_refresh(struct nf_flowtable *flow_table,

View File

@ -308,6 +308,9 @@ void switchdev_deferred_process(void);
int switchdev_port_attr_set(struct net_device *dev, int switchdev_port_attr_set(struct net_device *dev,
const struct switchdev_attr *attr, const struct switchdev_attr *attr,
struct netlink_ext_ack *extack); struct netlink_ext_ack *extack);
bool switchdev_port_obj_act_is_deferred(struct net_device *dev,
enum switchdev_notifier_type nt,
const struct switchdev_obj *obj);
int switchdev_port_obj_add(struct net_device *dev, int switchdev_port_obj_add(struct net_device *dev,
const struct switchdev_obj *obj, const struct switchdev_obj *obj,
struct netlink_ext_ack *extack); struct netlink_ext_ack *extack);

View File

@ -2506,7 +2506,7 @@ struct tcp_ulp_ops {
/* cleanup ulp */ /* cleanup ulp */
void (*release)(struct sock *sk); void (*release)(struct sock *sk);
/* diagnostic */ /* diagnostic */
int (*get_info)(const struct sock *sk, struct sk_buff *skb); int (*get_info)(struct sock *sk, struct sk_buff *skb);
size_t (*get_info_size)(const struct sock *sk); size_t (*get_info_size)(const struct sock *sk);
/* clone ulp */ /* clone ulp */
void (*clone)(const struct request_sock *req, struct sock *newsk, void (*clone)(const struct request_sock *req, struct sock *newsk,

View File

@ -1101,6 +1101,7 @@ struct bpf_hrtimer {
struct bpf_prog *prog; struct bpf_prog *prog;
void __rcu *callback_fn; void __rcu *callback_fn;
void *value; void *value;
struct rcu_head rcu;
}; };
/* the actual struct hidden inside uapi struct bpf_timer */ /* the actual struct hidden inside uapi struct bpf_timer */
@ -1332,6 +1333,7 @@ BPF_CALL_1(bpf_timer_cancel, struct bpf_timer_kern *, timer)
if (in_nmi()) if (in_nmi())
return -EOPNOTSUPP; return -EOPNOTSUPP;
rcu_read_lock();
__bpf_spin_lock_irqsave(&timer->lock); __bpf_spin_lock_irqsave(&timer->lock);
t = timer->timer; t = timer->timer;
if (!t) { if (!t) {
@ -1353,6 +1355,7 @@ out:
* if it was running. * if it was running.
*/ */
ret = ret ?: hrtimer_cancel(&t->timer); ret = ret ?: hrtimer_cancel(&t->timer);
rcu_read_unlock();
return ret; return ret;
} }
@ -1407,7 +1410,7 @@ out:
*/ */
if (this_cpu_read(hrtimer_running) != t) if (this_cpu_read(hrtimer_running) != t)
hrtimer_cancel(&t->timer); hrtimer_cancel(&t->timer);
kfree(t); kfree_rcu(t, rcu);
} }
BPF_CALL_2(bpf_kptr_xchg, void *, map_value, void *, ptr) BPF_CALL_2(bpf_kptr_xchg, void *, map_value, void *, ptr)

View File

@ -978,6 +978,8 @@ __bpf_kfunc int bpf_iter_task_new(struct bpf_iter_task *it,
BUILD_BUG_ON(__alignof__(struct bpf_iter_task_kern) != BUILD_BUG_ON(__alignof__(struct bpf_iter_task_kern) !=
__alignof__(struct bpf_iter_task)); __alignof__(struct bpf_iter_task));
kit->pos = NULL;
switch (flags) { switch (flags) {
case BPF_TASK_ITER_ALL_THREADS: case BPF_TASK_ITER_ALL_THREADS:
case BPF_TASK_ITER_ALL_PROCS: case BPF_TASK_ITER_ALL_PROCS:

View File

@ -5227,7 +5227,9 @@ BTF_ID(struct, prog_test_ref_kfunc)
#ifdef CONFIG_CGROUPS #ifdef CONFIG_CGROUPS
BTF_ID(struct, cgroup) BTF_ID(struct, cgroup)
#endif #endif
#ifdef CONFIG_BPF_JIT
BTF_ID(struct, bpf_cpumask) BTF_ID(struct, bpf_cpumask)
#endif
BTF_ID(struct, task_struct) BTF_ID(struct, task_struct)
BTF_SET_END(rcu_protected_types) BTF_SET_END(rcu_protected_types)

View File

@ -595,21 +595,40 @@ br_switchdev_mdb_replay_one(struct notifier_block *nb, struct net_device *dev,
} }
static int br_switchdev_mdb_queue_one(struct list_head *mdb_list, static int br_switchdev_mdb_queue_one(struct list_head *mdb_list,
struct net_device *dev,
unsigned long action,
enum switchdev_obj_id id, enum switchdev_obj_id id,
const struct net_bridge_mdb_entry *mp, const struct net_bridge_mdb_entry *mp,
struct net_device *orig_dev) struct net_device *orig_dev)
{ {
struct switchdev_obj_port_mdb *mdb; struct switchdev_obj_port_mdb mdb = {
.obj = {
.id = id,
.orig_dev = orig_dev,
},
};
struct switchdev_obj_port_mdb *pmdb;
mdb = kzalloc(sizeof(*mdb), GFP_ATOMIC); br_switchdev_mdb_populate(&mdb, mp);
if (!mdb)
if (action == SWITCHDEV_PORT_OBJ_ADD &&
switchdev_port_obj_act_is_deferred(dev, action, &mdb.obj)) {
/* This event is already in the deferred queue of
* events, so this replay must be elided, lest the
* driver receives duplicate events for it. This can
* only happen when replaying additions, since
* modifications are always immediately visible in
* br->mdb_list, whereas actual event delivery may be
* delayed.
*/
return 0;
}
pmdb = kmemdup(&mdb, sizeof(mdb), GFP_ATOMIC);
if (!pmdb)
return -ENOMEM; return -ENOMEM;
mdb->obj.id = id; list_add_tail(&pmdb->obj.list, mdb_list);
mdb->obj.orig_dev = orig_dev;
br_switchdev_mdb_populate(mdb, mp);
list_add_tail(&mdb->obj.list, mdb_list);
return 0; return 0;
} }
@ -677,51 +696,50 @@ br_switchdev_mdb_replay(struct net_device *br_dev, struct net_device *dev,
if (!br_opt_get(br, BROPT_MULTICAST_ENABLED)) if (!br_opt_get(br, BROPT_MULTICAST_ENABLED))
return 0; return 0;
/* We cannot walk over br->mdb_list protected just by the rtnl_mutex, if (adding)
* because the write-side protection is br->multicast_lock. But we action = SWITCHDEV_PORT_OBJ_ADD;
* need to emulate the [ blocking ] calling context of a regular else
* switchdev event, so since both br->multicast_lock and RCU read side action = SWITCHDEV_PORT_OBJ_DEL;
* critical sections are atomic, we have no choice but to pick the RCU
* read side lock, queue up all our events, leave the critical section
* and notify switchdev from blocking context.
*/
rcu_read_lock();
hlist_for_each_entry_rcu(mp, &br->mdb_list, mdb_node) { /* br_switchdev_mdb_queue_one() will take care to not queue a
* replay of an event that is already pending in the switchdev
* deferred queue. In order to safely determine that, there
* must be no new deferred MDB notifications enqueued for the
* duration of the MDB scan. Therefore, grab the write-side
* lock to avoid racing with any concurrent IGMP/MLD snooping.
*/
spin_lock_bh(&br->multicast_lock);
hlist_for_each_entry(mp, &br->mdb_list, mdb_node) {
struct net_bridge_port_group __rcu * const *pp; struct net_bridge_port_group __rcu * const *pp;
const struct net_bridge_port_group *p; const struct net_bridge_port_group *p;
if (mp->host_joined) { if (mp->host_joined) {
err = br_switchdev_mdb_queue_one(&mdb_list, err = br_switchdev_mdb_queue_one(&mdb_list, dev, action,
SWITCHDEV_OBJ_ID_HOST_MDB, SWITCHDEV_OBJ_ID_HOST_MDB,
mp, br_dev); mp, br_dev);
if (err) { if (err) {
rcu_read_unlock(); spin_unlock_bh(&br->multicast_lock);
goto out_free_mdb; goto out_free_mdb;
} }
} }
for (pp = &mp->ports; (p = rcu_dereference(*pp)) != NULL; for (pp = &mp->ports; (p = mlock_dereference(*pp, br)) != NULL;
pp = &p->next) { pp = &p->next) {
if (p->key.port->dev != dev) if (p->key.port->dev != dev)
continue; continue;
err = br_switchdev_mdb_queue_one(&mdb_list, err = br_switchdev_mdb_queue_one(&mdb_list, dev, action,
SWITCHDEV_OBJ_ID_PORT_MDB, SWITCHDEV_OBJ_ID_PORT_MDB,
mp, dev); mp, dev);
if (err) { if (err) {
rcu_read_unlock(); spin_unlock_bh(&br->multicast_lock);
goto out_free_mdb; goto out_free_mdb;
} }
} }
} }
rcu_read_unlock(); spin_unlock_bh(&br->multicast_lock);
if (adding)
action = SWITCHDEV_PORT_OBJ_ADD;
else
action = SWITCHDEV_PORT_OBJ_DEL;
list_for_each_entry(obj, &mdb_list, list) { list_for_each_entry(obj, &mdb_list, list) {
err = br_switchdev_mdb_replay_one(nb, dev, err = br_switchdev_mdb_replay_one(nb, dev,
@ -786,6 +804,16 @@ static void nbp_switchdev_unsync_objs(struct net_bridge_port *p,
br_switchdev_mdb_replay(br_dev, dev, ctx, false, blocking_nb, NULL); br_switchdev_mdb_replay(br_dev, dev, ctx, false, blocking_nb, NULL);
br_switchdev_vlan_replay(br_dev, ctx, false, blocking_nb, NULL); br_switchdev_vlan_replay(br_dev, ctx, false, blocking_nb, NULL);
/* Make sure that the device leaving this bridge has seen all
* relevant events before it is disassociated. In the normal
* case, when the device is directly attached to the bridge,
* this is covered by del_nbp(). If the association was indirect
* however, e.g. via a team or bond, and the device is leaving
* that intermediate device, then the bridge port remains in
* place.
*/
switchdev_deferred_process();
} }
/* Let the bridge know that this port is offloaded, so that it can assign a /* Let the bridge know that this port is offloaded, so that it can assign a

View File

@ -1226,8 +1226,11 @@ static void sk_psock_verdict_data_ready(struct sock *sk)
rcu_read_lock(); rcu_read_lock();
psock = sk_psock(sk); psock = sk_psock(sk);
if (psock) if (psock) {
psock->saved_data_ready(sk); read_lock_bh(&sk->sk_callback_lock);
sk_psock_data_ready(sk, psock);
read_unlock_bh(&sk->sk_callback_lock);
}
rcu_read_unlock(); rcu_read_unlock();
} }
} }

View File

@ -1188,6 +1188,17 @@ int sk_setsockopt(struct sock *sk, int level, int optname,
*/ */
WRITE_ONCE(sk->sk_txrehash, (u8)val); WRITE_ONCE(sk->sk_txrehash, (u8)val);
return 0; return 0;
case SO_PEEK_OFF:
{
int (*set_peek_off)(struct sock *sk, int val);
set_peek_off = READ_ONCE(sock->ops)->set_peek_off;
if (set_peek_off)
ret = set_peek_off(sk, val);
else
ret = -EOPNOTSUPP;
return ret;
}
} }
sockopt_lock_sock(sk); sockopt_lock_sock(sk);
@ -1430,18 +1441,6 @@ set_sndbuf:
sock_valbool_flag(sk, SOCK_WIFI_STATUS, valbool); sock_valbool_flag(sk, SOCK_WIFI_STATUS, valbool);
break; break;
case SO_PEEK_OFF:
{
int (*set_peek_off)(struct sock *sk, int val);
set_peek_off = READ_ONCE(sock->ops)->set_peek_off;
if (set_peek_off)
ret = set_peek_off(sk, val);
else
ret = -EOPNOTSUPP;
break;
}
case SO_NOFCS: case SO_NOFCS:
sock_valbool_flag(sk, SOCK_NOFCS, valbool); sock_valbool_flag(sk, SOCK_NOFCS, valbool);
break; break;

View File

@ -529,14 +529,20 @@ static int __init devlink_init(void)
{ {
int err; int err;
err = genl_register_family(&devlink_nl_family);
if (err)
goto out;
err = register_pernet_subsys(&devlink_pernet_ops); err = register_pernet_subsys(&devlink_pernet_ops);
if (err) if (err)
goto out; goto out;
err = genl_register_family(&devlink_nl_family);
if (err)
goto out_unreg_pernet_subsys;
err = register_netdevice_notifier(&devlink_port_netdevice_nb); err = register_netdevice_notifier(&devlink_port_netdevice_nb);
if (!err)
return 0;
genl_unregister_family(&devlink_nl_family);
out_unreg_pernet_subsys:
unregister_pernet_subsys(&devlink_pernet_ops);
out: out:
WARN_ON(err); WARN_ON(err);
return err; return err;

View File

@ -583,7 +583,7 @@ devlink_nl_port_get_dump_one(struct sk_buff *msg, struct devlink *devlink,
xa_for_each_start(&devlink->ports, port_index, devlink_port, state->idx) { xa_for_each_start(&devlink->ports, port_index, devlink_port, state->idx) {
err = devlink_nl_port_fill(msg, devlink_port, err = devlink_nl_port_fill(msg, devlink_port,
DEVLINK_CMD_NEW, DEVLINK_CMD_PORT_NEW,
NETLINK_CB(cb->skb).portid, NETLINK_CB(cb->skb).portid,
cb->nlh->nlmsg_seq, flags, cb->nlh->nlmsg_seq, flags,
cb->extack); cb->extack);

View File

@ -1125,7 +1125,8 @@ static int arp_req_get(struct arpreq *r, struct net_device *dev)
if (neigh) { if (neigh) {
if (!(READ_ONCE(neigh->nud_state) & NUD_NOARP)) { if (!(READ_ONCE(neigh->nud_state) & NUD_NOARP)) {
read_lock_bh(&neigh->lock); read_lock_bh(&neigh->lock);
memcpy(r->arp_ha.sa_data, neigh->ha, dev->addr_len); memcpy(r->arp_ha.sa_data, neigh->ha,
min(dev->addr_len, sizeof(r->arp_ha.sa_data_min)));
r->arp_flags = arp_state_to_flags(neigh); r->arp_flags = arp_state_to_flags(neigh);
read_unlock_bh(&neigh->lock); read_unlock_bh(&neigh->lock);
r->arp_ha.sa_family = dev->type; r->arp_ha.sa_family = dev->type;

View File

@ -1825,6 +1825,21 @@ done:
return err; return err;
} }
/* Combine dev_addr_genid and dev_base_seq to detect changes.
*/
static u32 inet_base_seq(const struct net *net)
{
u32 res = atomic_read(&net->ipv4.dev_addr_genid) +
net->dev_base_seq;
/* Must not return 0 (see nl_dump_check_consistent()).
* Chose a value far away from 0.
*/
if (!res)
res = 0x80000000;
return res;
}
static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb) static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
{ {
const struct nlmsghdr *nlh = cb->nlh; const struct nlmsghdr *nlh = cb->nlh;
@ -1876,8 +1891,7 @@ static int inet_dump_ifaddr(struct sk_buff *skb, struct netlink_callback *cb)
idx = 0; idx = 0;
head = &tgt_net->dev_index_head[h]; head = &tgt_net->dev_index_head[h];
rcu_read_lock(); rcu_read_lock();
cb->seq = atomic_read(&tgt_net->ipv4.dev_addr_genid) ^ cb->seq = inet_base_seq(tgt_net);
tgt_net->dev_base_seq;
hlist_for_each_entry_rcu(dev, head, index_hlist) { hlist_for_each_entry_rcu(dev, head, index_hlist) {
if (idx < s_idx) if (idx < s_idx)
goto cont; goto cont;
@ -2278,8 +2292,7 @@ static int inet_netconf_dump_devconf(struct sk_buff *skb,
idx = 0; idx = 0;
head = &net->dev_index_head[h]; head = &net->dev_index_head[h];
rcu_read_lock(); rcu_read_lock();
cb->seq = atomic_read(&net->ipv4.dev_addr_genid) ^ cb->seq = inet_base_seq(net);
net->dev_base_seq;
hlist_for_each_entry_rcu(dev, head, index_hlist) { hlist_for_each_entry_rcu(dev, head, index_hlist) {
if (idx < s_idx) if (idx < s_idx)
goto cont; goto cont;

View File

@ -1130,10 +1130,33 @@ ok:
return 0; return 0;
error: error:
if (sk_hashed(sk)) {
spinlock_t *lock = inet_ehash_lockp(hinfo, sk->sk_hash);
sock_prot_inuse_add(net, sk->sk_prot, -1);
spin_lock(lock);
sk_nulls_del_node_init_rcu(sk);
spin_unlock(lock);
sk->sk_hash = 0;
inet_sk(sk)->inet_sport = 0;
inet_sk(sk)->inet_num = 0;
if (tw)
inet_twsk_bind_unhash(tw, hinfo);
}
spin_unlock(&head2->lock); spin_unlock(&head2->lock);
if (tb_created) if (tb_created)
inet_bind_bucket_destroy(hinfo->bind_bucket_cachep, tb); inet_bind_bucket_destroy(hinfo->bind_bucket_cachep, tb);
spin_unlock_bh(&head->lock); spin_unlock(&head->lock);
if (tw)
inet_twsk_deschedule_put(tw);
local_bh_enable();
return -ENOMEM; return -ENOMEM;
} }

View File

@ -1589,12 +1589,7 @@ int udp_init_sock(struct sock *sk)
void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len) void skb_consume_udp(struct sock *sk, struct sk_buff *skb, int len)
{ {
if (unlikely(READ_ONCE(sk->sk_peek_off) >= 0)) { sk_peek_offset_bwd(sk, len);
bool slow = lock_sock_fast(sk);
sk_peek_offset_bwd(sk, len);
unlock_sock_fast(sk, slow);
}
if (!skb_unref(skb)) if (!skb_unref(skb))
return; return;

View File

@ -708,6 +708,22 @@ errout:
return err; return err;
} }
/* Combine dev_addr_genid and dev_base_seq to detect changes.
*/
static u32 inet6_base_seq(const struct net *net)
{
u32 res = atomic_read(&net->ipv6.dev_addr_genid) +
net->dev_base_seq;
/* Must not return 0 (see nl_dump_check_consistent()).
* Chose a value far away from 0.
*/
if (!res)
res = 0x80000000;
return res;
}
static int inet6_netconf_dump_devconf(struct sk_buff *skb, static int inet6_netconf_dump_devconf(struct sk_buff *skb,
struct netlink_callback *cb) struct netlink_callback *cb)
{ {
@ -741,8 +757,7 @@ static int inet6_netconf_dump_devconf(struct sk_buff *skb,
idx = 0; idx = 0;
head = &net->dev_index_head[h]; head = &net->dev_index_head[h];
rcu_read_lock(); rcu_read_lock();
cb->seq = atomic_read(&net->ipv6.dev_addr_genid) ^ cb->seq = inet6_base_seq(net);
net->dev_base_seq;
hlist_for_each_entry_rcu(dev, head, index_hlist) { hlist_for_each_entry_rcu(dev, head, index_hlist) {
if (idx < s_idx) if (idx < s_idx)
goto cont; goto cont;
@ -5362,7 +5377,7 @@ static int inet6_dump_addr(struct sk_buff *skb, struct netlink_callback *cb,
} }
rcu_read_lock(); rcu_read_lock();
cb->seq = atomic_read(&tgt_net->ipv6.dev_addr_genid) ^ tgt_net->dev_base_seq; cb->seq = inet6_base_seq(tgt_net);
for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) { for (h = s_h; h < NETDEV_HASHENTRIES; h++, s_idx = 0) {
idx = 0; idx = 0;
head = &tgt_net->dev_index_head[h]; head = &tgt_net->dev_index_head[h];

View File

@ -177,6 +177,8 @@ static bool ip6_parse_tlv(bool hopbyhop,
case IPV6_TLV_IOAM: case IPV6_TLV_IOAM:
if (!ipv6_hop_ioam(skb, off)) if (!ipv6_hop_ioam(skb, off))
return false; return false;
nh = skb_network_header(skb);
break; break;
case IPV6_TLV_JUMBO: case IPV6_TLV_JUMBO:
if (!ipv6_hop_jumbo(skb, off)) if (!ipv6_hop_jumbo(skb, off))
@ -943,6 +945,14 @@ static bool ipv6_hop_ioam(struct sk_buff *skb, int optoff)
if (!skb_valid_dst(skb)) if (!skb_valid_dst(skb))
ip6_route_input(skb); ip6_route_input(skb);
/* About to mangle packet header */
if (skb_ensure_writable(skb, optoff + 2 + hdr->opt_len))
goto drop;
/* Trace pointer may have changed */
trace = (struct ioam6_trace_hdr *)(skb_network_header(skb)
+ optoff + sizeof(*hdr));
ioam6_fill_trace_data(skb, ns, trace, true); ioam6_fill_trace_data(skb, ns, trace, true);
break; break;
default: default:

View File

@ -512,22 +512,24 @@ int __init seg6_init(void)
{ {
int err; int err;
err = genl_register_family(&seg6_genl_family); err = register_pernet_subsys(&ip6_segments_ops);
if (err) if (err)
goto out; goto out;
err = register_pernet_subsys(&ip6_segments_ops); err = genl_register_family(&seg6_genl_family);
if (err) if (err)
goto out_unregister_genl; goto out_unregister_pernet;
#ifdef CONFIG_IPV6_SEG6_LWTUNNEL #ifdef CONFIG_IPV6_SEG6_LWTUNNEL
err = seg6_iptunnel_init(); err = seg6_iptunnel_init();
if (err) if (err)
goto out_unregister_pernet; goto out_unregister_genl;
err = seg6_local_init(); err = seg6_local_init();
if (err) if (err) {
goto out_unregister_pernet; seg6_iptunnel_exit();
goto out_unregister_genl;
}
#endif #endif
#ifdef CONFIG_IPV6_SEG6_HMAC #ifdef CONFIG_IPV6_SEG6_HMAC
@ -548,11 +550,11 @@ out_unregister_iptun:
#endif #endif
#endif #endif
#ifdef CONFIG_IPV6_SEG6_LWTUNNEL #ifdef CONFIG_IPV6_SEG6_LWTUNNEL
out_unregister_pernet:
unregister_pernet_subsys(&ip6_segments_ops);
#endif
out_unregister_genl: out_unregister_genl:
genl_unregister_family(&seg6_genl_family); genl_unregister_family(&seg6_genl_family);
#endif
out_unregister_pernet:
unregister_pernet_subsys(&ip6_segments_ops);
goto out; goto out;
} }

View File

@ -156,7 +156,7 @@ static char iucv_error_pathid[16] = "INVALID PATHID";
static LIST_HEAD(iucv_handler_list); static LIST_HEAD(iucv_handler_list);
/* /*
* iucv_path_table: an array of iucv_path structures. * iucv_path_table: array of pointers to iucv_path structures.
*/ */
static struct iucv_path **iucv_path_table; static struct iucv_path **iucv_path_table;
static unsigned long iucv_max_pathid; static unsigned long iucv_max_pathid;
@ -544,7 +544,7 @@ static int iucv_enable(void)
cpus_read_lock(); cpus_read_lock();
rc = -ENOMEM; rc = -ENOMEM;
alloc_size = iucv_max_pathid * sizeof(struct iucv_path); alloc_size = iucv_max_pathid * sizeof(*iucv_path_table);
iucv_path_table = kzalloc(alloc_size, GFP_KERNEL); iucv_path_table = kzalloc(alloc_size, GFP_KERNEL);
if (!iucv_path_table) if (!iucv_path_table)
goto out; goto out;

View File

@ -627,7 +627,7 @@ static int l2tp_ip6_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
back_from_confirm: back_from_confirm:
lock_sock(sk); lock_sock(sk);
ulen = len + skb_queue_empty(&sk->sk_write_queue) ? transhdrlen : 0; ulen = len + (skb_queue_empty(&sk->sk_write_queue) ? transhdrlen : 0);
err = ip6_append_data(sk, ip_generic_getfrag, msg, err = ip6_append_data(sk, ip_generic_getfrag, msg,
ulen, transhdrlen, &ipc6, ulen, transhdrlen, &ipc6,
&fl6, (struct rt6_info *)dst, &fl6, (struct rt6_info *)dst,

View File

@ -663,7 +663,7 @@ struct mctp_sk_key *mctp_alloc_local_tag(struct mctp_sock *msk,
spin_unlock_irqrestore(&mns->keys_lock, flags); spin_unlock_irqrestore(&mns->keys_lock, flags);
if (!tagbits) { if (!tagbits) {
kfree(key); mctp_key_unref(key);
return ERR_PTR(-EBUSY); return ERR_PTR(-EBUSY);
} }

View File

@ -13,17 +13,19 @@
#include <uapi/linux/mptcp.h> #include <uapi/linux/mptcp.h>
#include "protocol.h" #include "protocol.h"
static int subflow_get_info(const struct sock *sk, struct sk_buff *skb) static int subflow_get_info(struct sock *sk, struct sk_buff *skb)
{ {
struct mptcp_subflow_context *sf; struct mptcp_subflow_context *sf;
struct nlattr *start; struct nlattr *start;
u32 flags = 0; u32 flags = 0;
bool slow;
int err; int err;
start = nla_nest_start_noflag(skb, INET_ULP_INFO_MPTCP); start = nla_nest_start_noflag(skb, INET_ULP_INFO_MPTCP);
if (!start) if (!start)
return -EMSGSIZE; return -EMSGSIZE;
slow = lock_sock_fast(sk);
rcu_read_lock(); rcu_read_lock();
sf = rcu_dereference(inet_csk(sk)->icsk_ulp_data); sf = rcu_dereference(inet_csk(sk)->icsk_ulp_data);
if (!sf) { if (!sf) {
@ -63,17 +65,19 @@ static int subflow_get_info(const struct sock *sk, struct sk_buff *skb)
sf->map_data_len) || sf->map_data_len) ||
nla_put_u32(skb, MPTCP_SUBFLOW_ATTR_FLAGS, flags) || nla_put_u32(skb, MPTCP_SUBFLOW_ATTR_FLAGS, flags) ||
nla_put_u8(skb, MPTCP_SUBFLOW_ATTR_ID_REM, sf->remote_id) || nla_put_u8(skb, MPTCP_SUBFLOW_ATTR_ID_REM, sf->remote_id) ||
nla_put_u8(skb, MPTCP_SUBFLOW_ATTR_ID_LOC, sf->local_id)) { nla_put_u8(skb, MPTCP_SUBFLOW_ATTR_ID_LOC, subflow_get_local_id(sf))) {
err = -EMSGSIZE; err = -EMSGSIZE;
goto nla_failure; goto nla_failure;
} }
rcu_read_unlock(); rcu_read_unlock();
unlock_sock_fast(sk, slow);
nla_nest_end(skb, start); nla_nest_end(skb, start);
return 0; return 0;
nla_failure: nla_failure:
rcu_read_unlock(); rcu_read_unlock();
unlock_sock_fast(sk, slow);
nla_nest_cancel(skb, start); nla_nest_cancel(skb, start);
return err; return err;
} }

View File

@ -396,19 +396,6 @@ void mptcp_pm_free_anno_list(struct mptcp_sock *msk)
} }
} }
static bool lookup_address_in_vec(const struct mptcp_addr_info *addrs, unsigned int nr,
const struct mptcp_addr_info *addr)
{
int i;
for (i = 0; i < nr; i++) {
if (addrs[i].id == addr->id)
return true;
}
return false;
}
/* Fill all the remote addresses into the array addrs[], /* Fill all the remote addresses into the array addrs[],
* and return the array size. * and return the array size.
*/ */
@ -440,18 +427,34 @@ static unsigned int fill_remote_addresses_vec(struct mptcp_sock *msk,
msk->pm.subflows++; msk->pm.subflows++;
addrs[i++] = remote; addrs[i++] = remote;
} else { } else {
DECLARE_BITMAP(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1);
/* Forbid creation of new subflows matching existing
* ones, possibly already created by incoming ADD_ADDR
*/
bitmap_zero(unavail_id, MPTCP_PM_MAX_ADDR_ID + 1);
mptcp_for_each_subflow(msk, subflow)
if (READ_ONCE(subflow->local_id) == local->id)
__set_bit(subflow->remote_id, unavail_id);
mptcp_for_each_subflow(msk, subflow) { mptcp_for_each_subflow(msk, subflow) {
ssk = mptcp_subflow_tcp_sock(subflow); ssk = mptcp_subflow_tcp_sock(subflow);
remote_address((struct sock_common *)ssk, &addrs[i]); remote_address((struct sock_common *)ssk, &addrs[i]);
addrs[i].id = subflow->remote_id; addrs[i].id = READ_ONCE(subflow->remote_id);
if (deny_id0 && !addrs[i].id) if (deny_id0 && !addrs[i].id)
continue; continue;
if (test_bit(addrs[i].id, unavail_id))
continue;
if (!mptcp_pm_addr_families_match(sk, local, &addrs[i])) if (!mptcp_pm_addr_families_match(sk, local, &addrs[i]))
continue; continue;
if (!lookup_address_in_vec(addrs, i, &addrs[i]) && if (msk->pm.subflows < subflows_max) {
msk->pm.subflows < subflows_max) { /* forbid creating multiple address towards
* this id
*/
__set_bit(addrs[i].id, unavail_id);
msk->pm.subflows++; msk->pm.subflows++;
i++; i++;
} }
@ -799,18 +802,18 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
mptcp_for_each_subflow_safe(msk, subflow, tmp) { mptcp_for_each_subflow_safe(msk, subflow, tmp) {
struct sock *ssk = mptcp_subflow_tcp_sock(subflow); struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
u8 remote_id = READ_ONCE(subflow->remote_id);
int how = RCV_SHUTDOWN | SEND_SHUTDOWN; int how = RCV_SHUTDOWN | SEND_SHUTDOWN;
u8 id = subflow->local_id; u8 id = subflow_get_local_id(subflow);
if (rm_type == MPTCP_MIB_RMADDR && subflow->remote_id != rm_id) if (rm_type == MPTCP_MIB_RMADDR && remote_id != rm_id)
continue; continue;
if (rm_type == MPTCP_MIB_RMSUBFLOW && !mptcp_local_id_match(msk, id, rm_id)) if (rm_type == MPTCP_MIB_RMSUBFLOW && !mptcp_local_id_match(msk, id, rm_id))
continue; continue;
pr_debug(" -> %s rm_list_ids[%d]=%u local_id=%u remote_id=%u mpc_id=%u", pr_debug(" -> %s rm_list_ids[%d]=%u local_id=%u remote_id=%u mpc_id=%u",
rm_type == MPTCP_MIB_RMADDR ? "address" : "subflow", rm_type == MPTCP_MIB_RMADDR ? "address" : "subflow",
i, rm_id, subflow->local_id, subflow->remote_id, i, rm_id, id, remote_id, msk->mpc_endpoint_id);
msk->mpc_endpoint_id);
spin_unlock_bh(&msk->pm.lock); spin_unlock_bh(&msk->pm.lock);
mptcp_subflow_shutdown(sk, ssk, how); mptcp_subflow_shutdown(sk, ssk, how);
@ -901,7 +904,8 @@ static void __mptcp_pm_release_addr_entry(struct mptcp_pm_addr_entry *entry)
} }
static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet, static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet,
struct mptcp_pm_addr_entry *entry) struct mptcp_pm_addr_entry *entry,
bool needs_id)
{ {
struct mptcp_pm_addr_entry *cur, *del_entry = NULL; struct mptcp_pm_addr_entry *cur, *del_entry = NULL;
unsigned int addr_max; unsigned int addr_max;
@ -949,7 +953,7 @@ static int mptcp_pm_nl_append_new_local_addr(struct pm_nl_pernet *pernet,
} }
} }
if (!entry->addr.id) { if (!entry->addr.id && needs_id) {
find_next: find_next:
entry->addr.id = find_next_zero_bit(pernet->id_bitmap, entry->addr.id = find_next_zero_bit(pernet->id_bitmap,
MPTCP_PM_MAX_ADDR_ID + 1, MPTCP_PM_MAX_ADDR_ID + 1,
@ -960,7 +964,7 @@ find_next:
} }
} }
if (!entry->addr.id) if (!entry->addr.id && needs_id)
goto out; goto out;
__set_bit(entry->addr.id, pernet->id_bitmap); __set_bit(entry->addr.id, pernet->id_bitmap);
@ -1092,7 +1096,7 @@ int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc
entry->ifindex = 0; entry->ifindex = 0;
entry->flags = MPTCP_PM_ADDR_FLAG_IMPLICIT; entry->flags = MPTCP_PM_ADDR_FLAG_IMPLICIT;
entry->lsk = NULL; entry->lsk = NULL;
ret = mptcp_pm_nl_append_new_local_addr(pernet, entry); ret = mptcp_pm_nl_append_new_local_addr(pernet, entry, true);
if (ret < 0) if (ret < 0)
kfree(entry); kfree(entry);
@ -1285,6 +1289,18 @@ next:
return 0; return 0;
} }
static bool mptcp_pm_has_addr_attr_id(const struct nlattr *attr,
struct genl_info *info)
{
struct nlattr *tb[MPTCP_PM_ADDR_ATTR_MAX + 1];
if (!nla_parse_nested_deprecated(tb, MPTCP_PM_ADDR_ATTR_MAX, attr,
mptcp_pm_address_nl_policy, info->extack) &&
tb[MPTCP_PM_ADDR_ATTR_ID])
return true;
return false;
}
int mptcp_pm_nl_add_addr_doit(struct sk_buff *skb, struct genl_info *info) int mptcp_pm_nl_add_addr_doit(struct sk_buff *skb, struct genl_info *info)
{ {
struct nlattr *attr = info->attrs[MPTCP_PM_ENDPOINT_ADDR]; struct nlattr *attr = info->attrs[MPTCP_PM_ENDPOINT_ADDR];
@ -1326,7 +1342,8 @@ int mptcp_pm_nl_add_addr_doit(struct sk_buff *skb, struct genl_info *info)
goto out_free; goto out_free;
} }
} }
ret = mptcp_pm_nl_append_new_local_addr(pernet, entry); ret = mptcp_pm_nl_append_new_local_addr(pernet, entry,
!mptcp_pm_has_addr_attr_id(attr, info));
if (ret < 0) { if (ret < 0) {
GENL_SET_ERR_MSG_FMT(info, "too many addresses or duplicate one: %d", ret); GENL_SET_ERR_MSG_FMT(info, "too many addresses or duplicate one: %d", ret);
goto out_free; goto out_free;
@ -1980,7 +1997,7 @@ static int mptcp_event_add_subflow(struct sk_buff *skb, const struct sock *ssk)
if (WARN_ON_ONCE(!sf)) if (WARN_ON_ONCE(!sf))
return -EINVAL; return -EINVAL;
if (nla_put_u8(skb, MPTCP_ATTR_LOC_ID, sf->local_id)) if (nla_put_u8(skb, MPTCP_ATTR_LOC_ID, subflow_get_local_id(sf)))
return -EMSGSIZE; return -EMSGSIZE;
if (nla_put_u8(skb, MPTCP_ATTR_REM_ID, sf->remote_id)) if (nla_put_u8(skb, MPTCP_ATTR_REM_ID, sf->remote_id))

View File

@ -26,7 +26,8 @@ void mptcp_free_local_addr_list(struct mptcp_sock *msk)
} }
static int mptcp_userspace_pm_append_new_local_addr(struct mptcp_sock *msk, static int mptcp_userspace_pm_append_new_local_addr(struct mptcp_sock *msk,
struct mptcp_pm_addr_entry *entry) struct mptcp_pm_addr_entry *entry,
bool needs_id)
{ {
DECLARE_BITMAP(id_bitmap, MPTCP_PM_MAX_ADDR_ID + 1); DECLARE_BITMAP(id_bitmap, MPTCP_PM_MAX_ADDR_ID + 1);
struct mptcp_pm_addr_entry *match = NULL; struct mptcp_pm_addr_entry *match = NULL;
@ -41,7 +42,7 @@ static int mptcp_userspace_pm_append_new_local_addr(struct mptcp_sock *msk,
spin_lock_bh(&msk->pm.lock); spin_lock_bh(&msk->pm.lock);
list_for_each_entry(e, &msk->pm.userspace_pm_local_addr_list, list) { list_for_each_entry(e, &msk->pm.userspace_pm_local_addr_list, list) {
addr_match = mptcp_addresses_equal(&e->addr, &entry->addr, true); addr_match = mptcp_addresses_equal(&e->addr, &entry->addr, true);
if (addr_match && entry->addr.id == 0) if (addr_match && entry->addr.id == 0 && needs_id)
entry->addr.id = e->addr.id; entry->addr.id = e->addr.id;
id_match = (e->addr.id == entry->addr.id); id_match = (e->addr.id == entry->addr.id);
if (addr_match && id_match) { if (addr_match && id_match) {
@ -64,7 +65,7 @@ static int mptcp_userspace_pm_append_new_local_addr(struct mptcp_sock *msk,
} }
*e = *entry; *e = *entry;
if (!e->addr.id) if (!e->addr.id && needs_id)
e->addr.id = find_next_zero_bit(id_bitmap, e->addr.id = find_next_zero_bit(id_bitmap,
MPTCP_PM_MAX_ADDR_ID + 1, MPTCP_PM_MAX_ADDR_ID + 1,
1); 1);
@ -153,7 +154,7 @@ int mptcp_userspace_pm_get_local_id(struct mptcp_sock *msk,
if (new_entry.addr.port == msk_sport) if (new_entry.addr.port == msk_sport)
new_entry.addr.port = 0; new_entry.addr.port = 0;
return mptcp_userspace_pm_append_new_local_addr(msk, &new_entry); return mptcp_userspace_pm_append_new_local_addr(msk, &new_entry, true);
} }
int mptcp_pm_nl_announce_doit(struct sk_buff *skb, struct genl_info *info) int mptcp_pm_nl_announce_doit(struct sk_buff *skb, struct genl_info *info)
@ -198,7 +199,7 @@ int mptcp_pm_nl_announce_doit(struct sk_buff *skb, struct genl_info *info)
goto announce_err; goto announce_err;
} }
err = mptcp_userspace_pm_append_new_local_addr(msk, &addr_val); err = mptcp_userspace_pm_append_new_local_addr(msk, &addr_val, false);
if (err < 0) { if (err < 0) {
GENL_SET_ERR_MSG(info, "did not match address and id"); GENL_SET_ERR_MSG(info, "did not match address and id");
goto announce_err; goto announce_err;
@ -233,7 +234,7 @@ static int mptcp_userspace_pm_remove_id_zero_address(struct mptcp_sock *msk,
lock_sock(sk); lock_sock(sk);
mptcp_for_each_subflow(msk, subflow) { mptcp_for_each_subflow(msk, subflow) {
if (subflow->local_id == 0) { if (READ_ONCE(subflow->local_id) == 0) {
has_id_0 = true; has_id_0 = true;
break; break;
} }
@ -378,7 +379,7 @@ int mptcp_pm_nl_subflow_create_doit(struct sk_buff *skb, struct genl_info *info)
} }
local.addr = addr_l; local.addr = addr_l;
err = mptcp_userspace_pm_append_new_local_addr(msk, &local); err = mptcp_userspace_pm_append_new_local_addr(msk, &local, false);
if (err < 0) { if (err < 0) {
GENL_SET_ERR_MSG(info, "did not match address and id"); GENL_SET_ERR_MSG(info, "did not match address and id");
goto create_err; goto create_err;

View File

@ -85,7 +85,7 @@ static int __mptcp_socket_create(struct mptcp_sock *msk)
subflow->subflow_id = msk->subflow_id++; subflow->subflow_id = msk->subflow_id++;
/* This is the first subflow, always with id 0 */ /* This is the first subflow, always with id 0 */
subflow->local_id_valid = 1; WRITE_ONCE(subflow->local_id, 0);
mptcp_sock_graft(msk->first, sk->sk_socket); mptcp_sock_graft(msk->first, sk->sk_socket);
iput(SOCK_INODE(ssock)); iput(SOCK_INODE(ssock));

View File

@ -491,10 +491,9 @@ struct mptcp_subflow_context {
remote_key_valid : 1, /* received the peer key from */ remote_key_valid : 1, /* received the peer key from */
disposable : 1, /* ctx can be free at ulp release time */ disposable : 1, /* ctx can be free at ulp release time */
stale : 1, /* unable to snd/rcv data, do not use for xmit */ stale : 1, /* unable to snd/rcv data, do not use for xmit */
local_id_valid : 1, /* local_id is correctly initialized */
valid_csum_seen : 1, /* at least one csum validated */ valid_csum_seen : 1, /* at least one csum validated */
is_mptfo : 1, /* subflow is doing TFO */ is_mptfo : 1, /* subflow is doing TFO */
__unused : 9; __unused : 10;
bool data_avail; bool data_avail;
bool scheduled; bool scheduled;
u32 remote_nonce; u32 remote_nonce;
@ -505,7 +504,7 @@ struct mptcp_subflow_context {
u8 hmac[MPTCPOPT_HMAC_LEN]; /* MPJ subflow only */ u8 hmac[MPTCPOPT_HMAC_LEN]; /* MPJ subflow only */
u64 iasn; /* initial ack sequence number, MPC subflows only */ u64 iasn; /* initial ack sequence number, MPC subflows only */
}; };
u8 local_id; s16 local_id; /* if negative not initialized yet */
u8 remote_id; u8 remote_id;
u8 reset_seen:1; u8 reset_seen:1;
u8 reset_transient:1; u8 reset_transient:1;
@ -556,6 +555,7 @@ mptcp_subflow_ctx_reset(struct mptcp_subflow_context *subflow)
{ {
memset(&subflow->reset, 0, sizeof(subflow->reset)); memset(&subflow->reset, 0, sizeof(subflow->reset));
subflow->request_mptcp = 1; subflow->request_mptcp = 1;
WRITE_ONCE(subflow->local_id, -1);
} }
static inline u64 static inline u64
@ -1022,6 +1022,15 @@ int mptcp_pm_get_local_id(struct mptcp_sock *msk, struct sock_common *skc);
int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc); int mptcp_pm_nl_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc);
int mptcp_userspace_pm_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc); int mptcp_userspace_pm_get_local_id(struct mptcp_sock *msk, struct mptcp_addr_info *skc);
static inline u8 subflow_get_local_id(const struct mptcp_subflow_context *subflow)
{
int local_id = READ_ONCE(subflow->local_id);
if (local_id < 0)
return 0;
return local_id;
}
void __init mptcp_pm_nl_init(void); void __init mptcp_pm_nl_init(void);
void mptcp_pm_nl_work(struct mptcp_sock *msk); void mptcp_pm_nl_work(struct mptcp_sock *msk);
void mptcp_pm_nl_rm_subflow_received(struct mptcp_sock *msk, void mptcp_pm_nl_rm_subflow_received(struct mptcp_sock *msk,

View File

@ -535,7 +535,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb)
subflow->backup = mp_opt.backup; subflow->backup = mp_opt.backup;
subflow->thmac = mp_opt.thmac; subflow->thmac = mp_opt.thmac;
subflow->remote_nonce = mp_opt.nonce; subflow->remote_nonce = mp_opt.nonce;
subflow->remote_id = mp_opt.join_id; WRITE_ONCE(subflow->remote_id, mp_opt.join_id);
pr_debug("subflow=%p, thmac=%llu, remote_nonce=%u backup=%d", pr_debug("subflow=%p, thmac=%llu, remote_nonce=%u backup=%d",
subflow, subflow->thmac, subflow->remote_nonce, subflow, subflow->thmac, subflow->remote_nonce,
subflow->backup); subflow->backup);
@ -577,8 +577,8 @@ do_reset:
static void subflow_set_local_id(struct mptcp_subflow_context *subflow, int local_id) static void subflow_set_local_id(struct mptcp_subflow_context *subflow, int local_id)
{ {
subflow->local_id = local_id; WARN_ON_ONCE(local_id < 0 || local_id > 255);
subflow->local_id_valid = 1; WRITE_ONCE(subflow->local_id, local_id);
} }
static int subflow_chk_local_id(struct sock *sk) static int subflow_chk_local_id(struct sock *sk)
@ -587,7 +587,7 @@ static int subflow_chk_local_id(struct sock *sk)
struct mptcp_sock *msk = mptcp_sk(subflow->conn); struct mptcp_sock *msk = mptcp_sk(subflow->conn);
int err; int err;
if (likely(subflow->local_id_valid)) if (likely(subflow->local_id >= 0))
return 0; return 0;
err = mptcp_pm_get_local_id(msk, (struct sock_common *)sk); err = mptcp_pm_get_local_id(msk, (struct sock_common *)sk);
@ -1567,7 +1567,7 @@ int __mptcp_subflow_connect(struct sock *sk, const struct mptcp_addr_info *loc,
pr_debug("msk=%p remote_token=%u local_id=%d remote_id=%d", msk, pr_debug("msk=%p remote_token=%u local_id=%d remote_id=%d", msk,
remote_token, local_id, remote_id); remote_token, local_id, remote_id);
subflow->remote_token = remote_token; subflow->remote_token = remote_token;
subflow->remote_id = remote_id; WRITE_ONCE(subflow->remote_id, remote_id);
subflow->request_join = 1; subflow->request_join = 1;
subflow->request_bkup = !!(flags & MPTCP_PM_ADDR_FLAG_BACKUP); subflow->request_bkup = !!(flags & MPTCP_PM_ADDR_FLAG_BACKUP);
subflow->subflow_id = msk->subflow_id++; subflow->subflow_id = msk->subflow_id++;
@ -1731,6 +1731,7 @@ static struct mptcp_subflow_context *subflow_create_ctx(struct sock *sk,
pr_debug("subflow=%p", ctx); pr_debug("subflow=%p", ctx);
ctx->tcp_sock = sk; ctx->tcp_sock = sk;
WRITE_ONCE(ctx->local_id, -1);
return ctx; return ctx;
} }
@ -1966,14 +1967,14 @@ static void subflow_ulp_clone(const struct request_sock *req,
new_ctx->idsn = subflow_req->idsn; new_ctx->idsn = subflow_req->idsn;
/* this is the first subflow, id is always 0 */ /* this is the first subflow, id is always 0 */
new_ctx->local_id_valid = 1; subflow_set_local_id(new_ctx, 0);
} else if (subflow_req->mp_join) { } else if (subflow_req->mp_join) {
new_ctx->ssn_offset = subflow_req->ssn_offset; new_ctx->ssn_offset = subflow_req->ssn_offset;
new_ctx->mp_join = 1; new_ctx->mp_join = 1;
new_ctx->fully_established = 1; new_ctx->fully_established = 1;
new_ctx->remote_key_valid = 1; new_ctx->remote_key_valid = 1;
new_ctx->backup = subflow_req->backup; new_ctx->backup = subflow_req->backup;
new_ctx->remote_id = subflow_req->remote_id; WRITE_ONCE(new_ctx->remote_id, subflow_req->remote_id);
new_ctx->token = subflow_req->token; new_ctx->token = subflow_req->token;
new_ctx->thmac = subflow_req->thmac; new_ctx->thmac = subflow_req->thmac;

View File

@ -87,12 +87,22 @@ static u32 flow_offload_dst_cookie(struct flow_offload_tuple *flow_tuple)
return 0; return 0;
} }
static struct dst_entry *nft_route_dst_fetch(struct nf_flow_route *route,
enum flow_offload_tuple_dir dir)
{
struct dst_entry *dst = route->tuple[dir].dst;
route->tuple[dir].dst = NULL;
return dst;
}
static int flow_offload_fill_route(struct flow_offload *flow, static int flow_offload_fill_route(struct flow_offload *flow,
const struct nf_flow_route *route, struct nf_flow_route *route,
enum flow_offload_tuple_dir dir) enum flow_offload_tuple_dir dir)
{ {
struct flow_offload_tuple *flow_tuple = &flow->tuplehash[dir].tuple; struct flow_offload_tuple *flow_tuple = &flow->tuplehash[dir].tuple;
struct dst_entry *dst = route->tuple[dir].dst; struct dst_entry *dst = nft_route_dst_fetch(route, dir);
int i, j = 0; int i, j = 0;
switch (flow_tuple->l3proto) { switch (flow_tuple->l3proto) {
@ -122,6 +132,7 @@ static int flow_offload_fill_route(struct flow_offload *flow,
ETH_ALEN); ETH_ALEN);
flow_tuple->out.ifidx = route->tuple[dir].out.ifindex; flow_tuple->out.ifidx = route->tuple[dir].out.ifindex;
flow_tuple->out.hw_ifidx = route->tuple[dir].out.hw_ifindex; flow_tuple->out.hw_ifidx = route->tuple[dir].out.hw_ifindex;
dst_release(dst);
break; break;
case FLOW_OFFLOAD_XMIT_XFRM: case FLOW_OFFLOAD_XMIT_XFRM:
case FLOW_OFFLOAD_XMIT_NEIGH: case FLOW_OFFLOAD_XMIT_NEIGH:
@ -146,7 +157,7 @@ static void nft_flow_dst_release(struct flow_offload *flow,
} }
void flow_offload_route_init(struct flow_offload *flow, void flow_offload_route_init(struct flow_offload *flow,
const struct nf_flow_route *route) struct nf_flow_route *route)
{ {
flow_offload_fill_route(flow, route, FLOW_OFFLOAD_DIR_ORIGINAL); flow_offload_fill_route(flow, route, FLOW_OFFLOAD_DIR_ORIGINAL);
flow_offload_fill_route(flow, route, FLOW_OFFLOAD_DIR_REPLY); flow_offload_fill_route(flow, route, FLOW_OFFLOAD_DIR_REPLY);

View File

@ -684,15 +684,16 @@ static int nft_delobj(struct nft_ctx *ctx, struct nft_object *obj)
return err; return err;
} }
static int nft_trans_flowtable_add(struct nft_ctx *ctx, int msg_type, static struct nft_trans *
struct nft_flowtable *flowtable) nft_trans_flowtable_add(struct nft_ctx *ctx, int msg_type,
struct nft_flowtable *flowtable)
{ {
struct nft_trans *trans; struct nft_trans *trans;
trans = nft_trans_alloc(ctx, msg_type, trans = nft_trans_alloc(ctx, msg_type,
sizeof(struct nft_trans_flowtable)); sizeof(struct nft_trans_flowtable));
if (trans == NULL) if (trans == NULL)
return -ENOMEM; return ERR_PTR(-ENOMEM);
if (msg_type == NFT_MSG_NEWFLOWTABLE) if (msg_type == NFT_MSG_NEWFLOWTABLE)
nft_activate_next(ctx->net, flowtable); nft_activate_next(ctx->net, flowtable);
@ -701,22 +702,22 @@ static int nft_trans_flowtable_add(struct nft_ctx *ctx, int msg_type,
nft_trans_flowtable(trans) = flowtable; nft_trans_flowtable(trans) = flowtable;
nft_trans_commit_list_add_tail(ctx->net, trans); nft_trans_commit_list_add_tail(ctx->net, trans);
return 0; return trans;
} }
static int nft_delflowtable(struct nft_ctx *ctx, static int nft_delflowtable(struct nft_ctx *ctx,
struct nft_flowtable *flowtable) struct nft_flowtable *flowtable)
{ {
int err; struct nft_trans *trans;
err = nft_trans_flowtable_add(ctx, NFT_MSG_DELFLOWTABLE, flowtable); trans = nft_trans_flowtable_add(ctx, NFT_MSG_DELFLOWTABLE, flowtable);
if (err < 0) if (IS_ERR(trans))
return err; return PTR_ERR(trans);
nft_deactivate_next(ctx->net, flowtable); nft_deactivate_next(ctx->net, flowtable);
nft_use_dec(&ctx->table->use); nft_use_dec(&ctx->table->use);
return err; return 0;
} }
static void __nft_reg_track_clobber(struct nft_regs_track *track, u8 dreg) static void __nft_reg_track_clobber(struct nft_regs_track *track, u8 dreg)
@ -1251,6 +1252,7 @@ static int nf_tables_updtable(struct nft_ctx *ctx)
return 0; return 0;
err_register_hooks: err_register_hooks:
ctx->table->flags |= NFT_TABLE_F_DORMANT;
nft_trans_destroy(trans); nft_trans_destroy(trans);
return ret; return ret;
} }
@ -2080,7 +2082,7 @@ static struct nft_hook *nft_netdev_hook_alloc(struct net *net,
struct nft_hook *hook; struct nft_hook *hook;
int err; int err;
hook = kmalloc(sizeof(struct nft_hook), GFP_KERNEL_ACCOUNT); hook = kzalloc(sizeof(struct nft_hook), GFP_KERNEL_ACCOUNT);
if (!hook) { if (!hook) {
err = -ENOMEM; err = -ENOMEM;
goto err_hook_alloc; goto err_hook_alloc;
@ -2503,19 +2505,15 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
RCU_INIT_POINTER(chain->blob_gen_0, blob); RCU_INIT_POINTER(chain->blob_gen_0, blob);
RCU_INIT_POINTER(chain->blob_gen_1, blob); RCU_INIT_POINTER(chain->blob_gen_1, blob);
err = nf_tables_register_hook(net, table, chain);
if (err < 0)
goto err_destroy_chain;
if (!nft_use_inc(&table->use)) { if (!nft_use_inc(&table->use)) {
err = -EMFILE; err = -EMFILE;
goto err_use; goto err_destroy_chain;
} }
trans = nft_trans_chain_add(ctx, NFT_MSG_NEWCHAIN); trans = nft_trans_chain_add(ctx, NFT_MSG_NEWCHAIN);
if (IS_ERR(trans)) { if (IS_ERR(trans)) {
err = PTR_ERR(trans); err = PTR_ERR(trans);
goto err_unregister_hook; goto err_trans;
} }
nft_trans_chain_policy(trans) = NFT_CHAIN_POLICY_UNSET; nft_trans_chain_policy(trans) = NFT_CHAIN_POLICY_UNSET;
@ -2523,17 +2521,22 @@ static int nf_tables_addchain(struct nft_ctx *ctx, u8 family, u8 genmask,
nft_trans_chain_policy(trans) = policy; nft_trans_chain_policy(trans) = policy;
err = nft_chain_add(table, chain); err = nft_chain_add(table, chain);
if (err < 0) { if (err < 0)
nft_trans_destroy(trans); goto err_chain_add;
goto err_unregister_hook;
} /* This must be LAST to ensure no packets are walking over this chain. */
err = nf_tables_register_hook(net, table, chain);
if (err < 0)
goto err_register_hook;
return 0; return 0;
err_unregister_hook: err_register_hook:
nft_chain_del(chain);
err_chain_add:
nft_trans_destroy(trans);
err_trans:
nft_use_dec_restore(&table->use); nft_use_dec_restore(&table->use);
err_use:
nf_tables_unregister_hook(net, table, chain);
err_destroy_chain: err_destroy_chain:
nf_tables_chain_destroy(ctx); nf_tables_chain_destroy(ctx);
@ -8455,9 +8458,9 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
u8 family = info->nfmsg->nfgen_family; u8 family = info->nfmsg->nfgen_family;
const struct nf_flowtable_type *type; const struct nf_flowtable_type *type;
struct nft_flowtable *flowtable; struct nft_flowtable *flowtable;
struct nft_hook *hook, *next;
struct net *net = info->net; struct net *net = info->net;
struct nft_table *table; struct nft_table *table;
struct nft_trans *trans;
struct nft_ctx ctx; struct nft_ctx ctx;
int err; int err;
@ -8537,34 +8540,34 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
err = nft_flowtable_parse_hook(&ctx, nla, &flowtable_hook, flowtable, err = nft_flowtable_parse_hook(&ctx, nla, &flowtable_hook, flowtable,
extack, true); extack, true);
if (err < 0) if (err < 0)
goto err4; goto err_flowtable_parse_hooks;
list_splice(&flowtable_hook.list, &flowtable->hook_list); list_splice(&flowtable_hook.list, &flowtable->hook_list);
flowtable->data.priority = flowtable_hook.priority; flowtable->data.priority = flowtable_hook.priority;
flowtable->hooknum = flowtable_hook.num; flowtable->hooknum = flowtable_hook.num;
trans = nft_trans_flowtable_add(&ctx, NFT_MSG_NEWFLOWTABLE, flowtable);
if (IS_ERR(trans)) {
err = PTR_ERR(trans);
goto err_flowtable_trans;
}
/* This must be LAST to ensure no packets are walking over this flowtable. */
err = nft_register_flowtable_net_hooks(ctx.net, table, err = nft_register_flowtable_net_hooks(ctx.net, table,
&flowtable->hook_list, &flowtable->hook_list,
flowtable); flowtable);
if (err < 0) {
nft_hooks_destroy(&flowtable->hook_list);
goto err4;
}
err = nft_trans_flowtable_add(&ctx, NFT_MSG_NEWFLOWTABLE, flowtable);
if (err < 0) if (err < 0)
goto err5; goto err_flowtable_hooks;
list_add_tail_rcu(&flowtable->list, &table->flowtables); list_add_tail_rcu(&flowtable->list, &table->flowtables);
return 0; return 0;
err5:
list_for_each_entry_safe(hook, next, &flowtable->hook_list, list) { err_flowtable_hooks:
nft_unregister_flowtable_hook(net, flowtable, hook); nft_trans_destroy(trans);
list_del_rcu(&hook->list); err_flowtable_trans:
kfree_rcu(hook, rcu); nft_hooks_destroy(&flowtable->hook_list);
} err_flowtable_parse_hooks:
err4:
flowtable->data.type->free(&flowtable->data); flowtable->data.type->free(&flowtable->data);
err3: err3:
module_put(type->owner); module_put(type->owner);

View File

@ -34,10 +34,10 @@ static int pn_ioctl(struct sock *sk, int cmd, int *karg)
switch (cmd) { switch (cmd) {
case SIOCINQ: case SIOCINQ:
lock_sock(sk); spin_lock_bh(&sk->sk_receive_queue.lock);
skb = skb_peek(&sk->sk_receive_queue); skb = skb_peek(&sk->sk_receive_queue);
*karg = skb ? skb->len : 0; *karg = skb ? skb->len : 0;
release_sock(sk); spin_unlock_bh(&sk->sk_receive_queue.lock);
return 0; return 0;
case SIOCPNADDRESOURCE: case SIOCPNADDRESOURCE:

View File

@ -917,6 +917,37 @@ static int pep_sock_enable(struct sock *sk, struct sockaddr *addr, int len)
return 0; return 0;
} }
static unsigned int pep_first_packet_length(struct sock *sk)
{
struct pep_sock *pn = pep_sk(sk);
struct sk_buff_head *q;
struct sk_buff *skb;
unsigned int len = 0;
bool found = false;
if (sock_flag(sk, SOCK_URGINLINE)) {
q = &pn->ctrlreq_queue;
spin_lock_bh(&q->lock);
skb = skb_peek(q);
if (skb) {
len = skb->len;
found = true;
}
spin_unlock_bh(&q->lock);
}
if (likely(!found)) {
q = &sk->sk_receive_queue;
spin_lock_bh(&q->lock);
skb = skb_peek(q);
if (skb)
len = skb->len;
spin_unlock_bh(&q->lock);
}
return len;
}
static int pep_ioctl(struct sock *sk, int cmd, int *karg) static int pep_ioctl(struct sock *sk, int cmd, int *karg)
{ {
struct pep_sock *pn = pep_sk(sk); struct pep_sock *pn = pep_sk(sk);
@ -929,15 +960,7 @@ static int pep_ioctl(struct sock *sk, int cmd, int *karg)
break; break;
} }
lock_sock(sk); *karg = pep_first_packet_length(sk);
if (sock_flag(sk, SOCK_URGINLINE) &&
!skb_queue_empty(&pn->ctrlreq_queue))
*karg = skb_peek(&pn->ctrlreq_queue)->len;
else if (!skb_queue_empty(&sk->sk_receive_queue))
*karg = skb_peek(&sk->sk_receive_queue)->len;
else
*karg = 0;
release_sock(sk);
ret = 0; ret = 0;
break; break;

View File

@ -232,18 +232,14 @@ release_idr:
return err; return err;
} }
static bool is_mirred_nested(void) static int
{ tcf_mirred_forward(bool at_ingress, bool want_ingress, struct sk_buff *skb)
return unlikely(__this_cpu_read(mirred_nest_level) > 1);
}
static int tcf_mirred_forward(bool want_ingress, struct sk_buff *skb)
{ {
int err; int err;
if (!want_ingress) if (!want_ingress)
err = tcf_dev_queue_xmit(skb, dev_queue_xmit); err = tcf_dev_queue_xmit(skb, dev_queue_xmit);
else if (is_mirred_nested()) else if (!at_ingress)
err = netif_rx(skb); err = netif_rx(skb);
else else
err = netif_receive_skb(skb); err = netif_receive_skb(skb);
@ -270,8 +266,7 @@ static int tcf_mirred_to_dev(struct sk_buff *skb, struct tcf_mirred *m,
if (unlikely(!(dev->flags & IFF_UP)) || !netif_carrier_ok(dev)) { if (unlikely(!(dev->flags & IFF_UP)) || !netif_carrier_ok(dev)) {
net_notice_ratelimited("tc mirred to Houston: device %s is down\n", net_notice_ratelimited("tc mirred to Houston: device %s is down\n",
dev->name); dev->name);
err = -ENODEV; goto err_cant_do;
goto out;
} }
/* we could easily avoid the clone only if called by ingress and clsact; /* we could easily avoid the clone only if called by ingress and clsact;
@ -283,10 +278,8 @@ static int tcf_mirred_to_dev(struct sk_buff *skb, struct tcf_mirred *m,
tcf_mirred_can_reinsert(retval); tcf_mirred_can_reinsert(retval);
if (!dont_clone) { if (!dont_clone) {
skb_to_send = skb_clone(skb, GFP_ATOMIC); skb_to_send = skb_clone(skb, GFP_ATOMIC);
if (!skb_to_send) { if (!skb_to_send)
err = -ENOMEM; goto err_cant_do;
goto out;
}
} }
want_ingress = tcf_mirred_act_wants_ingress(m_eaction); want_ingress = tcf_mirred_act_wants_ingress(m_eaction);
@ -319,19 +312,20 @@ static int tcf_mirred_to_dev(struct sk_buff *skb, struct tcf_mirred *m,
skb_set_redirected(skb_to_send, skb_to_send->tc_at_ingress); skb_set_redirected(skb_to_send, skb_to_send->tc_at_ingress);
err = tcf_mirred_forward(want_ingress, skb_to_send); err = tcf_mirred_forward(at_ingress, want_ingress, skb_to_send);
} else { } else {
err = tcf_mirred_forward(want_ingress, skb_to_send); err = tcf_mirred_forward(at_ingress, want_ingress, skb_to_send);
} }
if (err)
if (err) {
out:
tcf_action_inc_overlimit_qstats(&m->common); tcf_action_inc_overlimit_qstats(&m->common);
if (is_redirect)
retval = TC_ACT_SHOT;
}
return retval; return retval;
err_cant_do:
if (is_redirect)
retval = TC_ACT_SHOT;
tcf_action_inc_overlimit_qstats(&m->common);
return retval;
} }
static int tcf_blockcast_redir(struct sk_buff *skb, struct tcf_mirred *m, static int tcf_blockcast_redir(struct sk_buff *skb, struct tcf_mirred *m,

View File

@ -2460,8 +2460,11 @@ unbind_filter:
} }
errout_idr: errout_idr:
if (!fold) if (!fold) {
spin_lock(&tp->lock);
idr_remove(&head->handle_idr, fnew->handle); idr_remove(&head->handle_idr, fnew->handle);
spin_unlock(&tp->lock);
}
__fl_put(fnew); __fl_put(fnew);
errout_tb: errout_tb:
kfree(tb); kfree(tb);

View File

@ -19,6 +19,35 @@
#include <linux/rtnetlink.h> #include <linux/rtnetlink.h>
#include <net/switchdev.h> #include <net/switchdev.h>
static bool switchdev_obj_eq(const struct switchdev_obj *a,
const struct switchdev_obj *b)
{
const struct switchdev_obj_port_vlan *va, *vb;
const struct switchdev_obj_port_mdb *ma, *mb;
if (a->id != b->id || a->orig_dev != b->orig_dev)
return false;
switch (a->id) {
case SWITCHDEV_OBJ_ID_PORT_VLAN:
va = SWITCHDEV_OBJ_PORT_VLAN(a);
vb = SWITCHDEV_OBJ_PORT_VLAN(b);
return va->flags == vb->flags &&
va->vid == vb->vid &&
va->changed == vb->changed;
case SWITCHDEV_OBJ_ID_PORT_MDB:
case SWITCHDEV_OBJ_ID_HOST_MDB:
ma = SWITCHDEV_OBJ_PORT_MDB(a);
mb = SWITCHDEV_OBJ_PORT_MDB(b);
return ma->vid == mb->vid &&
ether_addr_equal(ma->addr, mb->addr);
default:
break;
}
BUG();
}
static LIST_HEAD(deferred); static LIST_HEAD(deferred);
static DEFINE_SPINLOCK(deferred_lock); static DEFINE_SPINLOCK(deferred_lock);
@ -307,6 +336,50 @@ int switchdev_port_obj_del(struct net_device *dev,
} }
EXPORT_SYMBOL_GPL(switchdev_port_obj_del); EXPORT_SYMBOL_GPL(switchdev_port_obj_del);
/**
* switchdev_port_obj_act_is_deferred - Is object action pending?
*
* @dev: port device
* @nt: type of action; add or delete
* @obj: object to test
*
* Returns true if a deferred item is pending, which is
* equivalent to the action @nt on an object @obj.
*
* rtnl_lock must be held.
*/
bool switchdev_port_obj_act_is_deferred(struct net_device *dev,
enum switchdev_notifier_type nt,
const struct switchdev_obj *obj)
{
struct switchdev_deferred_item *dfitem;
bool found = false;
ASSERT_RTNL();
spin_lock_bh(&deferred_lock);
list_for_each_entry(dfitem, &deferred, list) {
if (dfitem->dev != dev)
continue;
if ((dfitem->func == switchdev_port_obj_add_deferred &&
nt == SWITCHDEV_PORT_OBJ_ADD) ||
(dfitem->func == switchdev_port_obj_del_deferred &&
nt == SWITCHDEV_PORT_OBJ_DEL)) {
if (switchdev_obj_eq((const void *)dfitem->data, obj)) {
found = true;
break;
}
}
}
spin_unlock_bh(&deferred_lock);
return found;
}
EXPORT_SYMBOL_GPL(switchdev_port_obj_act_is_deferred);
static ATOMIC_NOTIFIER_HEAD(switchdev_notif_chain); static ATOMIC_NOTIFIER_HEAD(switchdev_notif_chain);
static BLOCKING_NOTIFIER_HEAD(switchdev_blocking_notif_chain); static BLOCKING_NOTIFIER_HEAD(switchdev_blocking_notif_chain);

View File

@ -1003,7 +1003,7 @@ static u16 tls_user_config(struct tls_context *ctx, bool tx)
return 0; return 0;
} }
static int tls_get_info(const struct sock *sk, struct sk_buff *skb) static int tls_get_info(struct sock *sk, struct sk_buff *skb)
{ {
u16 version, cipher_type; u16 version, cipher_type;
struct tls_context *ctx; struct tls_context *ctx;

View File

@ -1772,7 +1772,8 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
u8 *control, u8 *control,
size_t skip, size_t skip,
size_t len, size_t len,
bool is_peek) bool is_peek,
bool *more)
{ {
struct sk_buff *skb = skb_peek(&ctx->rx_list); struct sk_buff *skb = skb_peek(&ctx->rx_list);
struct tls_msg *tlm; struct tls_msg *tlm;
@ -1785,7 +1786,7 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
err = tls_record_content_type(msg, tlm, control); err = tls_record_content_type(msg, tlm, control);
if (err <= 0) if (err <= 0)
goto out; goto more;
if (skip < rxm->full_len) if (skip < rxm->full_len)
break; break;
@ -1803,12 +1804,12 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
err = tls_record_content_type(msg, tlm, control); err = tls_record_content_type(msg, tlm, control);
if (err <= 0) if (err <= 0)
goto out; goto more;
err = skb_copy_datagram_msg(skb, rxm->offset + skip, err = skb_copy_datagram_msg(skb, rxm->offset + skip,
msg, chunk); msg, chunk);
if (err < 0) if (err < 0)
goto out; goto more;
len = len - chunk; len = len - chunk;
copied = copied + chunk; copied = copied + chunk;
@ -1844,6 +1845,10 @@ static int process_rx_list(struct tls_sw_context_rx *ctx,
out: out:
return copied ? : err; return copied ? : err;
more:
if (more)
*more = true;
goto out;
} }
static bool static bool
@ -1947,6 +1952,7 @@ int tls_sw_recvmsg(struct sock *sk,
int target, err; int target, err;
bool is_kvec = iov_iter_is_kvec(&msg->msg_iter); bool is_kvec = iov_iter_is_kvec(&msg->msg_iter);
bool is_peek = flags & MSG_PEEK; bool is_peek = flags & MSG_PEEK;
bool rx_more = false;
bool released = true; bool released = true;
bool bpf_strp_enabled; bool bpf_strp_enabled;
bool zc_capable; bool zc_capable;
@ -1966,12 +1972,12 @@ int tls_sw_recvmsg(struct sock *sk,
goto end; goto end;
/* Process pending decrypted records. It must be non-zero-copy */ /* Process pending decrypted records. It must be non-zero-copy */
err = process_rx_list(ctx, msg, &control, 0, len, is_peek); err = process_rx_list(ctx, msg, &control, 0, len, is_peek, &rx_more);
if (err < 0) if (err < 0)
goto end; goto end;
copied = err; copied = err;
if (len <= copied) if (len <= copied || (copied && control != TLS_RECORD_TYPE_DATA) || rx_more)
goto end; goto end;
target = sock_rcvlowat(sk, flags & MSG_WAITALL, len); target = sock_rcvlowat(sk, flags & MSG_WAITALL, len);
@ -2064,6 +2070,8 @@ put_on_rx_list:
decrypted += chunk; decrypted += chunk;
len -= chunk; len -= chunk;
__skb_queue_tail(&ctx->rx_list, skb); __skb_queue_tail(&ctx->rx_list, skb);
if (unlikely(control != TLS_RECORD_TYPE_DATA))
break;
continue; continue;
} }
@ -2128,10 +2136,10 @@ recv_end:
/* Drain records from the rx_list & copy if required */ /* Drain records from the rx_list & copy if required */
if (is_peek || is_kvec) if (is_peek || is_kvec)
err = process_rx_list(ctx, msg, &control, copied, err = process_rx_list(ctx, msg, &control, copied,
decrypted, is_peek); decrypted, is_peek, NULL);
else else
err = process_rx_list(ctx, msg, &control, 0, err = process_rx_list(ctx, msg, &control, 0,
async_copy_bytes, is_peek); async_copy_bytes, is_peek, NULL);
} }
copied += decrypted; copied += decrypted;

View File

@ -782,19 +782,6 @@ static int unix_seqpacket_sendmsg(struct socket *, struct msghdr *, size_t);
static int unix_seqpacket_recvmsg(struct socket *, struct msghdr *, size_t, static int unix_seqpacket_recvmsg(struct socket *, struct msghdr *, size_t,
int); int);
static int unix_set_peek_off(struct sock *sk, int val)
{
struct unix_sock *u = unix_sk(sk);
if (mutex_lock_interruptible(&u->iolock))
return -EINTR;
WRITE_ONCE(sk->sk_peek_off, val);
mutex_unlock(&u->iolock);
return 0;
}
#ifdef CONFIG_PROC_FS #ifdef CONFIG_PROC_FS
static int unix_count_nr_fds(struct sock *sk) static int unix_count_nr_fds(struct sock *sk)
{ {
@ -862,7 +849,7 @@ static const struct proto_ops unix_stream_ops = {
.read_skb = unix_stream_read_skb, .read_skb = unix_stream_read_skb,
.mmap = sock_no_mmap, .mmap = sock_no_mmap,
.splice_read = unix_stream_splice_read, .splice_read = unix_stream_splice_read,
.set_peek_off = unix_set_peek_off, .set_peek_off = sk_set_peek_off,
.show_fdinfo = unix_show_fdinfo, .show_fdinfo = unix_show_fdinfo,
}; };
@ -886,7 +873,7 @@ static const struct proto_ops unix_dgram_ops = {
.read_skb = unix_read_skb, .read_skb = unix_read_skb,
.recvmsg = unix_dgram_recvmsg, .recvmsg = unix_dgram_recvmsg,
.mmap = sock_no_mmap, .mmap = sock_no_mmap,
.set_peek_off = unix_set_peek_off, .set_peek_off = sk_set_peek_off,
.show_fdinfo = unix_show_fdinfo, .show_fdinfo = unix_show_fdinfo,
}; };
@ -909,7 +896,7 @@ static const struct proto_ops unix_seqpacket_ops = {
.sendmsg = unix_seqpacket_sendmsg, .sendmsg = unix_seqpacket_sendmsg,
.recvmsg = unix_seqpacket_recvmsg, .recvmsg = unix_seqpacket_recvmsg,
.mmap = sock_no_mmap, .mmap = sock_no_mmap,
.set_peek_off = unix_set_peek_off, .set_peek_off = sk_set_peek_off,
.show_fdinfo = unix_show_fdinfo, .show_fdinfo = unix_show_fdinfo,
}; };

View File

@ -284,9 +284,17 @@ void unix_gc(void)
* which are creating the cycle(s). * which are creating the cycle(s).
*/ */
skb_queue_head_init(&hitlist); skb_queue_head_init(&hitlist);
list_for_each_entry(u, &gc_candidates, link) list_for_each_entry(u, &gc_candidates, link) {
scan_children(&u->sk, inc_inflight, &hitlist); scan_children(&u->sk, inc_inflight, &hitlist);
#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
if (u->oob_skb) {
kfree_skb(u->oob_skb);
u->oob_skb = NULL;
}
#endif
}
/* not_cycle_list contains those sockets which do not make up a /* not_cycle_list contains those sockets which do not make up a
* cycle. Restore these to the inflight list. * cycle. Restore these to the inflight list.
*/ */
@ -314,18 +322,6 @@ void unix_gc(void)
/* Here we are. Hitlist is filled. Die. */ /* Here we are. Hitlist is filled. Die. */
__skb_queue_purge(&hitlist); __skb_queue_purge(&hitlist);
#if IS_ENABLED(CONFIG_AF_UNIX_OOB)
while (!list_empty(&gc_candidates)) {
u = list_entry(gc_candidates.next, struct unix_sock, link);
if (u->oob_skb) {
struct sk_buff *skb = u->oob_skb;
u->oob_skb = NULL;
kfree_skb(skb);
}
}
#endif
spin_lock(&unix_gc_lock); spin_lock(&unix_gc_lock);
/* There could be io_uring registered files, just push them back to /* There could be io_uring registered files, just push them back to

View File

@ -722,7 +722,8 @@ static struct sk_buff *xsk_build_skb(struct xdp_sock *xs,
memcpy(vaddr, buffer, len); memcpy(vaddr, buffer, len);
kunmap_local(vaddr); kunmap_local(vaddr);
skb_add_rx_frag(skb, nr_frags, page, 0, len, 0); skb_add_rx_frag(skb, nr_frags, page, 0, len, PAGE_SIZE);
refcount_add(PAGE_SIZE, &xs->sk.sk_wmem_alloc);
} }
if (first_frag && desc->options & XDP_TX_METADATA) { if (first_frag && desc->options & XDP_TX_METADATA) {

View File

@ -513,7 +513,7 @@ eBPF programs can have an associated license, passed along with the bytecode
instructions to the kernel when the programs are loaded. The format for that instructions to the kernel when the programs are loaded. The format for that
string is identical to the one in use for kernel modules (Dual licenses, such string is identical to the one in use for kernel modules (Dual licenses, such
as "Dual BSD/GPL", may be used). Some helper functions are only accessible to as "Dual BSD/GPL", may be used). Some helper functions are only accessible to
programs that are compatible with the GNU Privacy License (GPL). programs that are compatible with the GNU General Public License (GNU GPL).
In order to use such helpers, the eBPF program must be loaded with the correct In order to use such helpers, the eBPF program must be loaded with the correct
license string passed (via **attr**) to the **bpf**\\ () system call, and this license string passed (via **attr**) to the **bpf**\\ () system call, and this

View File

@ -466,6 +466,8 @@ ynl_gemsg_start_dump(struct ynl_sock *ys, __u32 id, __u8 cmd, __u8 version)
int ynl_recv_ack(struct ynl_sock *ys, int ret) int ynl_recv_ack(struct ynl_sock *ys, int ret)
{ {
struct ynl_parse_arg yarg = { .ys = ys, };
if (!ret) { if (!ret) {
yerr(ys, YNL_ERROR_EXPECT_ACK, yerr(ys, YNL_ERROR_EXPECT_ACK,
"Expecting an ACK but nothing received"); "Expecting an ACK but nothing received");
@ -478,7 +480,7 @@ int ynl_recv_ack(struct ynl_sock *ys, int ret)
return ret; return ret;
} }
return mnl_cb_run(ys->rx_buf, ret, ys->seq, ys->portid, return mnl_cb_run(ys->rx_buf, ret, ys->seq, ys->portid,
ynl_cb_null, ys); ynl_cb_null, &yarg);
} }
int ynl_cb_null(const struct nlmsghdr *nlh, void *data) int ynl_cb_null(const struct nlmsghdr *nlh, void *data)
@ -586,7 +588,13 @@ static int ynl_sock_read_family(struct ynl_sock *ys, const char *family_name)
return err; return err;
} }
return ynl_recv_ack(ys, err); err = ynl_recv_ack(ys, err);
if (err < 0) {
free(ys->mcast_groups);
return err;
}
return 0;
} }
struct ynl_sock * struct ynl_sock *
@ -741,11 +749,14 @@ err_free:
static int ynl_ntf_trampoline(const struct nlmsghdr *nlh, void *data) static int ynl_ntf_trampoline(const struct nlmsghdr *nlh, void *data)
{ {
return ynl_ntf_parse((struct ynl_sock *)data, nlh); struct ynl_parse_arg *yarg = data;
return ynl_ntf_parse(yarg->ys, nlh);
} }
int ynl_ntf_check(struct ynl_sock *ys) int ynl_ntf_check(struct ynl_sock *ys)
{ {
struct ynl_parse_arg yarg = { .ys = ys, };
ssize_t len; ssize_t len;
int err; int err;
@ -767,7 +778,7 @@ int ynl_ntf_check(struct ynl_sock *ys)
return len; return len;
err = mnl_cb_run2(ys->rx_buf, len, ys->seq, ys->portid, err = mnl_cb_run2(ys->rx_buf, len, ys->seq, ys->portid,
ynl_ntf_trampoline, ys, ynl_ntf_trampoline, &yarg,
ynl_cb_array, NLMSG_MIN_TYPE); ynl_cb_array, NLMSG_MIN_TYPE);
if (err < 0) if (err < 0)
return err; return err;

View File

@ -193,6 +193,7 @@ static void subtest_task_iters(void)
ASSERT_EQ(skel->bss->procs_cnt, 1, "procs_cnt"); ASSERT_EQ(skel->bss->procs_cnt, 1, "procs_cnt");
ASSERT_EQ(skel->bss->threads_cnt, thread_num + 1, "threads_cnt"); ASSERT_EQ(skel->bss->threads_cnt, thread_num + 1, "threads_cnt");
ASSERT_EQ(skel->bss->proc_threads_cnt, thread_num + 1, "proc_threads_cnt"); ASSERT_EQ(skel->bss->proc_threads_cnt, thread_num + 1, "proc_threads_cnt");
ASSERT_EQ(skel->bss->invalid_cnt, 0, "invalid_cnt");
pthread_mutex_unlock(&do_nothing_mutex); pthread_mutex_unlock(&do_nothing_mutex);
for (int i = 0; i < thread_num; i++) for (int i = 0; i < thread_num; i++)
ASSERT_OK(pthread_join(thread_ids[i], &ret), "pthread_join"); ASSERT_OK(pthread_join(thread_ids[i], &ret), "pthread_join");

View File

@ -0,0 +1,57 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (C) 2024. Huawei Technologies Co., Ltd */
#include "test_progs.h"
#include "read_vsyscall.skel.h"
#if defined(__x86_64__)
/* For VSYSCALL_ADDR */
#include <asm/vsyscall.h>
#else
/* To prevent build failure on non-x86 arch */
#define VSYSCALL_ADDR 0UL
#endif
struct read_ret_desc {
const char *name;
int ret;
} all_read[] = {
{ .name = "probe_read_kernel", .ret = -ERANGE },
{ .name = "probe_read_kernel_str", .ret = -ERANGE },
{ .name = "probe_read", .ret = -ERANGE },
{ .name = "probe_read_str", .ret = -ERANGE },
{ .name = "probe_read_user", .ret = -EFAULT },
{ .name = "probe_read_user_str", .ret = -EFAULT },
{ .name = "copy_from_user", .ret = -EFAULT },
{ .name = "copy_from_user_task", .ret = -EFAULT },
};
void test_read_vsyscall(void)
{
struct read_vsyscall *skel;
unsigned int i;
int err;
#if !defined(__x86_64__)
test__skip();
return;
#endif
skel = read_vsyscall__open_and_load();
if (!ASSERT_OK_PTR(skel, "read_vsyscall open_load"))
return;
skel->bss->target_pid = getpid();
err = read_vsyscall__attach(skel);
if (!ASSERT_EQ(err, 0, "read_vsyscall attach"))
goto out;
/* userspace may don't have vsyscall page due to LEGACY_VSYSCALL_NONE,
* but it doesn't affect the returned error codes.
*/
skel->bss->user_ptr = (void *)VSYSCALL_ADDR;
usleep(1);
for (i = 0; i < ARRAY_SIZE(all_read); i++)
ASSERT_EQ(skel->bss->read_ret[i], all_read[i].ret, all_read[i].name);
out:
read_vsyscall__destroy(skel);
}

View File

@ -4,10 +4,29 @@
#include "timer.skel.h" #include "timer.skel.h"
#include "timer_failure.skel.h" #include "timer_failure.skel.h"
#define NUM_THR 8
static void *spin_lock_thread(void *arg)
{
int i, err, prog_fd = *(int *)arg;
LIBBPF_OPTS(bpf_test_run_opts, topts);
for (i = 0; i < 10000; i++) {
err = bpf_prog_test_run_opts(prog_fd, &topts);
if (!ASSERT_OK(err, "test_run_opts err") ||
!ASSERT_OK(topts.retval, "test_run_opts retval"))
break;
}
pthread_exit(arg);
}
static int timer(struct timer *timer_skel) static int timer(struct timer *timer_skel)
{ {
int err, prog_fd; int i, err, prog_fd;
LIBBPF_OPTS(bpf_test_run_opts, topts); LIBBPF_OPTS(bpf_test_run_opts, topts);
pthread_t thread_id[NUM_THR];
void *ret;
err = timer__attach(timer_skel); err = timer__attach(timer_skel);
if (!ASSERT_OK(err, "timer_attach")) if (!ASSERT_OK(err, "timer_attach"))
@ -43,6 +62,20 @@ static int timer(struct timer *timer_skel)
/* check that code paths completed */ /* check that code paths completed */
ASSERT_EQ(timer_skel->bss->ok, 1 | 2 | 4, "ok"); ASSERT_EQ(timer_skel->bss->ok, 1 | 2 | 4, "ok");
prog_fd = bpf_program__fd(timer_skel->progs.race);
for (i = 0; i < NUM_THR; i++) {
err = pthread_create(&thread_id[i], NULL,
&spin_lock_thread, &prog_fd);
if (!ASSERT_OK(err, "pthread_create"))
break;
}
while (i) {
err = pthread_join(thread_id[--i], &ret);
if (ASSERT_OK(err, "pthread_join"))
ASSERT_EQ(ret, (void *)&prog_fd, "pthread_join");
}
return 0; return 0;
} }

View File

@ -10,7 +10,7 @@
char _license[] SEC("license") = "GPL"; char _license[] SEC("license") = "GPL";
pid_t target_pid; pid_t target_pid;
int procs_cnt, threads_cnt, proc_threads_cnt; int procs_cnt, threads_cnt, proc_threads_cnt, invalid_cnt;
void bpf_rcu_read_lock(void) __ksym; void bpf_rcu_read_lock(void) __ksym;
void bpf_rcu_read_unlock(void) __ksym; void bpf_rcu_read_unlock(void) __ksym;
@ -26,6 +26,16 @@ int iter_task_for_each_sleep(void *ctx)
procs_cnt = threads_cnt = proc_threads_cnt = 0; procs_cnt = threads_cnt = proc_threads_cnt = 0;
bpf_rcu_read_lock(); bpf_rcu_read_lock();
bpf_for_each(task, pos, NULL, ~0U) {
/* Below instructions shouldn't be executed for invalid flags */
invalid_cnt++;
}
bpf_for_each(task, pos, NULL, BPF_TASK_ITER_PROC_THREADS) {
/* Below instructions shouldn't be executed for invalid task__nullable */
invalid_cnt++;
}
bpf_for_each(task, pos, NULL, BPF_TASK_ITER_ALL_PROCS) bpf_for_each(task, pos, NULL, BPF_TASK_ITER_ALL_PROCS)
if (pos->pid == target_pid) if (pos->pid == target_pid)
procs_cnt++; procs_cnt++;

View File

@ -0,0 +1,45 @@
// SPDX-License-Identifier: GPL-2.0
/* Copyright (C) 2024. Huawei Technologies Co., Ltd */
#include <linux/types.h>
#include <bpf/bpf_helpers.h>
#include "bpf_misc.h"
int target_pid = 0;
void *user_ptr = 0;
int read_ret[8];
char _license[] SEC("license") = "GPL";
SEC("fentry/" SYS_PREFIX "sys_nanosleep")
int do_probe_read(void *ctx)
{
char buf[8];
if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
return 0;
read_ret[0] = bpf_probe_read_kernel(buf, sizeof(buf), user_ptr);
read_ret[1] = bpf_probe_read_kernel_str(buf, sizeof(buf), user_ptr);
read_ret[2] = bpf_probe_read(buf, sizeof(buf), user_ptr);
read_ret[3] = bpf_probe_read_str(buf, sizeof(buf), user_ptr);
read_ret[4] = bpf_probe_read_user(buf, sizeof(buf), user_ptr);
read_ret[5] = bpf_probe_read_user_str(buf, sizeof(buf), user_ptr);
return 0;
}
SEC("fentry.s/" SYS_PREFIX "sys_nanosleep")
int do_copy_from_user(void *ctx)
{
char buf[8];
if ((bpf_get_current_pid_tgid() >> 32) != target_pid)
return 0;
read_ret[6] = bpf_copy_from_user(buf, sizeof(buf), user_ptr);
read_ret[7] = bpf_copy_from_user_task(buf, sizeof(buf), user_ptr,
bpf_get_current_task_btf(), 0);
return 0;
}

View File

@ -51,7 +51,8 @@ struct {
__uint(max_entries, 1); __uint(max_entries, 1);
__type(key, int); __type(key, int);
__type(value, struct elem); __type(value, struct elem);
} abs_timer SEC(".maps"), soft_timer_pinned SEC(".maps"), abs_timer_pinned SEC(".maps"); } abs_timer SEC(".maps"), soft_timer_pinned SEC(".maps"), abs_timer_pinned SEC(".maps"),
race_array SEC(".maps");
__u64 bss_data; __u64 bss_data;
__u64 abs_data; __u64 abs_data;
@ -390,3 +391,34 @@ int BPF_PROG2(test5, int, a)
return 0; return 0;
} }
static int race_timer_callback(void *race_array, int *race_key, struct bpf_timer *timer)
{
bpf_timer_start(timer, 1000000, 0);
return 0;
}
SEC("syscall")
int race(void *ctx)
{
struct bpf_timer *timer;
int err, race_key = 0;
struct elem init;
__builtin_memset(&init, 0, sizeof(struct elem));
bpf_map_update_elem(&race_array, &race_key, &init, BPF_ANY);
timer = bpf_map_lookup_elem(&race_array, &race_key);
if (!timer)
return 1;
err = bpf_timer_init(timer, &race_array, CLOCK_MONOTONIC);
if (err && err != -EBUSY)
return 1;
bpf_timer_set_callback(timer, race_timer_callback);
bpf_timer_start(timer, 0, 0);
bpf_timer_cancel(timer);
return 0;
}

View File

@ -62,6 +62,8 @@ prio_test()
# create bond # create bond
bond_reset "${param}" bond_reset "${param}"
# set active_slave to primary eth1 specifically
ip -n ${s_ns} link set bond0 type bond active_slave eth1
# check bonding member prio value # check bonding member prio value
ip -n ${s_ns} link set eth0 type bond_slave prio 0 ip -n ${s_ns} link set eth0 type bond_slave prio 0

View File

@ -235,9 +235,6 @@ mirred_egress_to_ingress_tcp_test()
check_err $? "didn't mirred redirect ICMP" check_err $? "didn't mirred redirect ICMP"
tc_check_packets "dev $h1 ingress" 102 10 tc_check_packets "dev $h1 ingress" 102 10
check_err $? "didn't drop mirred ICMP" check_err $? "didn't drop mirred ICMP"
local overlimits=$(tc_rule_stats_get ${h1} 101 egress .overlimits)
test ${overlimits} = 10
check_err $? "wrong overlimits, expected 10 got ${overlimits}"
tc filter del dev $h1 egress protocol ip pref 100 handle 100 flower tc filter del dev $h1 egress protocol ip pref 100 handle 100 flower
tc filter del dev $h1 egress protocol ip pref 101 handle 101 flower tc filter del dev $h1 egress protocol ip pref 101 handle 101 flower

View File

@ -367,14 +367,12 @@ run_test()
local desc=$2 local desc=$2
local node_src=$3 local node_src=$3
local node_dst=$4 local node_dst=$4
local ip6_src=$5 local ip6_dst=$5
local ip6_dst=$6 local trace_type=$6
local if_dst=$7 local ioam_ns=$7
local trace_type=$8 local type=$8
local ioam_ns=$9
ip netns exec $node_dst ./ioam6_parser $if_dst $name $ip6_src $ip6_dst \ ip netns exec $node_dst ./ioam6_parser $name $trace_type $ioam_ns $type &
$trace_type $ioam_ns &
local spid=$! local spid=$!
sleep 0.1 sleep 0.1
@ -489,7 +487,7 @@ out_undef_ns()
trace prealloc type 0x800000 ns 0 size 4 dev veth0 trace prealloc type 0x800000 ns 0 size 4 dev veth0
run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \
db01::2 db01::1 veth0 0x800000 0 db01::1 0x800000 0 $1
[ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down
} }
@ -509,7 +507,7 @@ out_no_room()
trace prealloc type 0xc00000 ns 123 size 4 dev veth0 trace prealloc type 0xc00000 ns 123 size 4 dev veth0
run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \
db01::2 db01::1 veth0 0xc00000 123 db01::1 0xc00000 123 $1
[ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down
} }
@ -543,14 +541,14 @@ out_bits()
if [ $cmd_res != 0 ] if [ $cmd_res != 0 ]
then then
npassed=$((npassed+1)) npassed=$((npassed+1))
log_test_passed "$descr" log_test_passed "$descr ($1 mode)"
else else
nfailed=$((nfailed+1)) nfailed=$((nfailed+1))
log_test_failed "$descr" log_test_failed "$descr ($1 mode)"
fi fi
else else
run_test "out_bit$i" "$descr ($1 mode)" $ioam_node_alpha \ run_test "out_bit$i" "$descr ($1 mode)" $ioam_node_alpha \
$ioam_node_beta db01::2 db01::1 veth0 ${bit2type[$i]} 123 $ioam_node_beta db01::1 ${bit2type[$i]} 123 $1
fi fi
done done
@ -574,7 +572,7 @@ out_full_supp_trace()
trace prealloc type 0xfff002 ns 123 size 100 dev veth0 trace prealloc type 0xfff002 ns 123 size 100 dev veth0
run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \
db01::2 db01::1 veth0 0xfff002 123 db01::1 0xfff002 123 $1
[ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down
} }
@ -604,7 +602,7 @@ in_undef_ns()
trace prealloc type 0x800000 ns 0 size 4 dev veth0 trace prealloc type 0x800000 ns 0 size 4 dev veth0
run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \
db01::2 db01::1 veth0 0x800000 0 db01::1 0x800000 0 $1
[ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down
} }
@ -624,7 +622,7 @@ in_no_room()
trace prealloc type 0xc00000 ns 123 size 4 dev veth0 trace prealloc type 0xc00000 ns 123 size 4 dev veth0
run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \
db01::2 db01::1 veth0 0xc00000 123 db01::1 0xc00000 123 $1
[ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down
} }
@ -651,7 +649,7 @@ in_bits()
dev veth0 dev veth0
run_test "in_bit$i" "${desc/<n>/$i} ($1 mode)" $ioam_node_alpha \ run_test "in_bit$i" "${desc/<n>/$i} ($1 mode)" $ioam_node_alpha \
$ioam_node_beta db01::2 db01::1 veth0 ${bit2type[$i]} 123 $ioam_node_beta db01::1 ${bit2type[$i]} 123 $1
done done
[ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down
@ -679,7 +677,7 @@ in_oflag()
trace prealloc type 0xc00000 ns 123 size 4 dev veth0 trace prealloc type 0xc00000 ns 123 size 4 dev veth0
run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \
db01::2 db01::1 veth0 0xc00000 123 db01::1 0xc00000 123 $1
[ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down
@ -703,7 +701,7 @@ in_full_supp_trace()
trace prealloc type 0xfff002 ns 123 size 80 dev veth0 trace prealloc type 0xfff002 ns 123 size 80 dev veth0
run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \ run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_beta \
db01::2 db01::1 veth0 0xfff002 123 db01::1 0xfff002 123 $1
[ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_beta link set ip6tnl0 down
} }
@ -731,7 +729,7 @@ fwd_full_supp_trace()
trace prealloc type 0xfff002 ns 123 size 244 via db01::1 dev veth0 trace prealloc type 0xfff002 ns 123 size 244 via db01::1 dev veth0
run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_gamma \ run_test ${FUNCNAME[0]} "${desc} ($1 mode)" $ioam_node_alpha $ioam_node_gamma \
db01::2 db02::2 veth0 0xfff002 123 db02::2 0xfff002 123 $1
[ "$1" = "encap" ] && ip -netns $ioam_node_gamma link set ip6tnl0 down [ "$1" = "encap" ] && ip -netns $ioam_node_gamma link set ip6tnl0 down
} }

View File

@ -8,7 +8,6 @@
#include <errno.h> #include <errno.h>
#include <limits.h> #include <limits.h>
#include <linux/const.h> #include <linux/const.h>
#include <linux/if_ether.h>
#include <linux/ioam6.h> #include <linux/ioam6.h>
#include <linux/ipv6.h> #include <linux/ipv6.h>
#include <stdlib.h> #include <stdlib.h>
@ -512,14 +511,6 @@ static int str2id(const char *tname)
return -1; return -1;
} }
static int ipv6_addr_equal(const struct in6_addr *a1, const struct in6_addr *a2)
{
return ((a1->s6_addr32[0] ^ a2->s6_addr32[0]) |
(a1->s6_addr32[1] ^ a2->s6_addr32[1]) |
(a1->s6_addr32[2] ^ a2->s6_addr32[2]) |
(a1->s6_addr32[3] ^ a2->s6_addr32[3])) == 0;
}
static int get_u32(__u32 *val, const char *arg, int base) static int get_u32(__u32 *val, const char *arg, int base)
{ {
unsigned long res; unsigned long res;
@ -603,70 +594,80 @@ static int (*func[__TEST_MAX])(int, struct ioam6_trace_hdr *, __u32, __u16) = {
int main(int argc, char **argv) int main(int argc, char **argv)
{ {
int fd, size, hoplen, tid, ret = 1; int fd, size, hoplen, tid, ret = 1, on = 1;
struct in6_addr src, dst;
struct ioam6_hdr *opt; struct ioam6_hdr *opt;
struct ipv6hdr *ip6h; struct cmsghdr *cmsg;
__u8 buffer[400], *p; struct msghdr msg;
__u16 ioam_ns; struct iovec iov;
__u8 buffer[512];
__u32 tr_type; __u32 tr_type;
__u16 ioam_ns;
__u8 *ptr;
if (argc != 7) if (argc != 5)
goto out; goto out;
tid = str2id(argv[2]); tid = str2id(argv[1]);
if (tid < 0 || !func[tid]) if (tid < 0 || !func[tid])
goto out; goto out;
if (inet_pton(AF_INET6, argv[3], &src) != 1 || if (get_u32(&tr_type, argv[2], 16) ||
inet_pton(AF_INET6, argv[4], &dst) != 1) get_u16(&ioam_ns, argv[3], 0))
goto out; goto out;
if (get_u32(&tr_type, argv[5], 16) || fd = socket(PF_INET6, SOCK_RAW,
get_u16(&ioam_ns, argv[6], 0)) !strcmp(argv[4], "encap") ? IPPROTO_IPV6 : IPPROTO_ICMPV6);
if (fd < 0)
goto out; goto out;
fd = socket(AF_PACKET, SOCK_DGRAM, __cpu_to_be16(ETH_P_IPV6)); setsockopt(fd, IPPROTO_IPV6, IPV6_RECVHOPOPTS, &on, sizeof(on));
if (!fd)
goto out;
if (setsockopt(fd, SOL_SOCKET, SO_BINDTODEVICE, iov.iov_len = 1;
argv[1], strlen(argv[1]))) iov.iov_base = malloc(CMSG_SPACE(sizeof(buffer)));
if (!iov.iov_base)
goto close; goto close;
recv: recv:
size = recv(fd, buffer, sizeof(buffer), 0); memset(&msg, 0, sizeof(msg));
msg.msg_iov = &iov;
msg.msg_iovlen = 1;
msg.msg_control = buffer;
msg.msg_controllen = CMSG_SPACE(sizeof(buffer));
size = recvmsg(fd, &msg, 0);
if (size <= 0) if (size <= 0)
goto close; goto close;
ip6h = (struct ipv6hdr *)buffer; for (cmsg = CMSG_FIRSTHDR(&msg); cmsg; cmsg = CMSG_NXTHDR(&msg, cmsg)) {
if (cmsg->cmsg_level != IPPROTO_IPV6 ||
cmsg->cmsg_type != IPV6_HOPOPTS ||
cmsg->cmsg_len < sizeof(struct ipv6_hopopt_hdr))
continue;
if (!ipv6_addr_equal(&ip6h->saddr, &src) || ptr = (__u8 *)CMSG_DATA(cmsg);
!ipv6_addr_equal(&ip6h->daddr, &dst))
goto recv;
if (ip6h->nexthdr != IPPROTO_HOPOPTS) hoplen = (ptr[1] + 1) << 3;
goto close; ptr += sizeof(struct ipv6_hopopt_hdr);
p = buffer + sizeof(*ip6h); while (hoplen > 0) {
hoplen = (p[1] + 1) << 3; opt = (struct ioam6_hdr *)ptr;
p += sizeof(struct ipv6_hopopt_hdr);
while (hoplen > 0) { if (opt->opt_type == IPV6_TLV_IOAM &&
opt = (struct ioam6_hdr *)p; opt->type == IOAM6_TYPE_PREALLOC) {
ptr += sizeof(*opt);
ret = func[tid](tid,
(struct ioam6_trace_hdr *)ptr,
tr_type, ioam_ns);
goto close;
}
if (opt->opt_type == IPV6_TLV_IOAM && ptr += opt->opt_len + 2;
opt->type == IOAM6_TYPE_PREALLOC) { hoplen -= opt->opt_len + 2;
p += sizeof(*opt);
ret = func[tid](tid, (struct ioam6_trace_hdr *)p,
tr_type, ioam_ns);
break;
} }
p += opt->opt_len + 2;
hoplen -= opt->opt_len + 2;
} }
goto recv;
close: close:
free(iov.iov_base);
close(fd); close(fd);
out: out:
return ret; return ret;

View File

@ -62,8 +62,8 @@ __chk_nr()
nr=$(eval $command) nr=$(eval $command)
printf "%-50s" "$msg" printf "%-50s" "$msg"
if [ $nr != $expected ]; then if [ "$nr" != "$expected" ]; then
if [ $nr = "$skip" ] && ! mptcp_lib_expect_all_features; then if [ "$nr" = "$skip" ] && ! mptcp_lib_expect_all_features; then
echo "[ skip ] Feature probably not supported" echo "[ skip ] Feature probably not supported"
mptcp_lib_result_skip "${msg}" mptcp_lib_result_skip "${msg}"
else else
@ -166,9 +166,13 @@ chk_msk_listen()
chk_msk_inuse() chk_msk_inuse()
{ {
local expected=$1 local expected=$1
local msg="$2" local msg="....chk ${2:-${expected}} msk in use"
local listen_nr local listen_nr
if [ "${expected}" -eq 0 ]; then
msg+=" after flush"
fi
listen_nr=$(ss -N "${ns}" -Ml | grep -c LISTEN) listen_nr=$(ss -N "${ns}" -Ml | grep -c LISTEN)
expected=$((expected + listen_nr)) expected=$((expected + listen_nr))
@ -179,16 +183,21 @@ chk_msk_inuse()
sleep 0.1 sleep 0.1
done done
__chk_nr get_msk_inuse $expected "$msg" 0 __chk_nr get_msk_inuse $expected "${msg}" 0
} }
# $1: cestab nr # $1: cestab nr
chk_msk_cestab() chk_msk_cestab()
{ {
local cestab=$1 local expected=$1
local msg="....chk ${2:-${expected}} cestab"
if [ "${expected}" -eq 0 ]; then
msg+=" after flush"
fi
__chk_nr "mptcp_lib_get_counter ${ns} MPTcpExtMPCurrEstab" \ __chk_nr "mptcp_lib_get_counter ${ns} MPTcpExtMPCurrEstab" \
"${cestab}" "....chk ${cestab} cestab" "" "${expected}" "${msg}" ""
} }
wait_connected() wait_connected()
@ -227,12 +236,12 @@ wait_connected $ns 10000
chk_msk_nr 2 "after MPC handshake " chk_msk_nr 2 "after MPC handshake "
chk_msk_remote_key_nr 2 "....chk remote_key" chk_msk_remote_key_nr 2 "....chk remote_key"
chk_msk_fallback_nr 0 "....chk no fallback" chk_msk_fallback_nr 0 "....chk no fallback"
chk_msk_inuse 2 "....chk 2 msk in use" chk_msk_inuse 2
chk_msk_cestab 2 chk_msk_cestab 2
flush_pids flush_pids
chk_msk_inuse 0 "....chk 0 msk in use after flush" chk_msk_inuse 0 "2->0"
chk_msk_cestab 0 chk_msk_cestab 0 "2->0"
echo "a" | \ echo "a" | \
timeout ${timeout_test} \ timeout ${timeout_test} \
@ -247,12 +256,12 @@ echo "b" | \
127.0.0.1 >/dev/null & 127.0.0.1 >/dev/null &
wait_connected $ns 10001 wait_connected $ns 10001
chk_msk_fallback_nr 1 "check fallback" chk_msk_fallback_nr 1 "check fallback"
chk_msk_inuse 1 "....chk 1 msk in use" chk_msk_inuse 1
chk_msk_cestab 1 chk_msk_cestab 1
flush_pids flush_pids
chk_msk_inuse 0 "....chk 0 msk in use after flush" chk_msk_inuse 0 "1->0"
chk_msk_cestab 0 chk_msk_cestab 0 "1->0"
NR_CLIENTS=100 NR_CLIENTS=100
for I in `seq 1 $NR_CLIENTS`; do for I in `seq 1 $NR_CLIENTS`; do
@ -273,12 +282,12 @@ for I in `seq 1 $NR_CLIENTS`; do
done done
wait_msk_nr $((NR_CLIENTS*2)) "many msk socket present" wait_msk_nr $((NR_CLIENTS*2)) "many msk socket present"
chk_msk_inuse $((NR_CLIENTS*2)) "....chk many msk in use" chk_msk_inuse $((NR_CLIENTS*2)) "many"
chk_msk_cestab $((NR_CLIENTS*2)) chk_msk_cestab $((NR_CLIENTS*2)) "many"
flush_pids flush_pids
chk_msk_inuse 0 "....chk 0 msk in use after flush" chk_msk_inuse 0 "many->0"
chk_msk_cestab 0 chk_msk_cestab 0 "many->0"
mptcp_lib_result_print_all_tap mptcp_lib_result_print_all_tap
exit $ret exit $ret

View File

@ -183,7 +183,7 @@ check "ip netns exec $ns1 ./pm_nl_ctl dump" "id 1 flags \
subflow 10.0.1.1" " (nobackup)" subflow 10.0.1.1" " (nobackup)"
# fullmesh support has been added later # fullmesh support has been added later
ip netns exec $ns1 ./pm_nl_ctl set id 1 flags fullmesh ip netns exec $ns1 ./pm_nl_ctl set id 1 flags fullmesh 2>/dev/null
if ip netns exec $ns1 ./pm_nl_ctl dump | grep -q "fullmesh" || if ip netns exec $ns1 ./pm_nl_ctl dump | grep -q "fullmesh" ||
mptcp_lib_expect_all_features; then mptcp_lib_expect_all_features; then
check "ip netns exec $ns1 ./pm_nl_ctl dump" "id 1 flags \ check "ip netns exec $ns1 ./pm_nl_ctl dump" "id 1 flags \
@ -194,6 +194,12 @@ subflow 10.0.1.1" " (nofullmesh)"
ip netns exec $ns1 ./pm_nl_ctl set id 1 flags backup,fullmesh ip netns exec $ns1 ./pm_nl_ctl set id 1 flags backup,fullmesh
check "ip netns exec $ns1 ./pm_nl_ctl dump" "id 1 flags \ check "ip netns exec $ns1 ./pm_nl_ctl dump" "id 1 flags \
subflow,backup,fullmesh 10.0.1.1" " (backup,fullmesh)" subflow,backup,fullmesh 10.0.1.1" " (backup,fullmesh)"
else
for st in fullmesh nofullmesh backup,fullmesh; do
st=" (${st})"
printf "%-50s%s\n" "${st}" "[SKIP]"
mptcp_lib_result_skip "${st}"
done
fi fi
mptcp_lib_result_print_all_tap mptcp_lib_result_print_all_tap

View File

@ -250,7 +250,8 @@ run_test()
[ $bail -eq 0 ] || exit $ret [ $bail -eq 0 ] || exit $ret
fi fi
printf "%-60s" "$msg - reverse direction" msg+=" - reverse direction"
printf "%-60s" "${msg}"
do_transfer $large $small $time do_transfer $large $small $time
lret=$? lret=$?
mptcp_lib_result_code "${lret}" "${msg}" mptcp_lib_result_code "${lret}" "${msg}"

View File

@ -75,7 +75,7 @@ print_test()
{ {
test_name="${1}" test_name="${1}"
_printf "%-63s" "${test_name}" _printf "%-68s" "${test_name}"
} }
print_results() print_results()
@ -542,7 +542,7 @@ verify_subflow_events()
local remid local remid
local info local info
info="${e_saddr} (${e_from}) => ${e_daddr} (${e_to})" info="${e_saddr} (${e_from}) => ${e_daddr}:${e_dport} (${e_to})"
if [ "$e_type" = "$SUB_ESTABLISHED" ] if [ "$e_type" = "$SUB_ESTABLISHED" ]
then then

View File

@ -1485,6 +1485,51 @@ TEST_F(tls, control_msg)
EXPECT_EQ(memcmp(buf, test_str, send_len), 0); EXPECT_EQ(memcmp(buf, test_str, send_len), 0);
} }
TEST_F(tls, control_msg_nomerge)
{
char *rec1 = "1111";
char *rec2 = "2222";
int send_len = 5;
char buf[15];
if (self->notls)
SKIP(return, "no TLS support");
EXPECT_EQ(tls_send_cmsg(self->fd, 100, rec1, send_len, 0), send_len);
EXPECT_EQ(tls_send_cmsg(self->fd, 100, rec2, send_len, 0), send_len);
EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, 100, buf, sizeof(buf), MSG_PEEK), send_len);
EXPECT_EQ(memcmp(buf, rec1, send_len), 0);
EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, 100, buf, sizeof(buf), MSG_PEEK), send_len);
EXPECT_EQ(memcmp(buf, rec1, send_len), 0);
EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, 100, buf, sizeof(buf), 0), send_len);
EXPECT_EQ(memcmp(buf, rec1, send_len), 0);
EXPECT_EQ(tls_recv_cmsg(_metadata, self->cfd, 100, buf, sizeof(buf), 0), send_len);
EXPECT_EQ(memcmp(buf, rec2, send_len), 0);
}
TEST_F(tls, data_control_data)
{
char *rec1 = "1111";
char *rec2 = "2222";
char *rec3 = "3333";
int send_len = 5;
char buf[15];
if (self->notls)
SKIP(return, "no TLS support");
EXPECT_EQ(send(self->fd, rec1, send_len, 0), send_len);
EXPECT_EQ(tls_send_cmsg(self->fd, 100, rec2, send_len, 0), send_len);
EXPECT_EQ(send(self->fd, rec3, send_len, 0), send_len);
EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), MSG_PEEK), send_len);
EXPECT_EQ(recv(self->cfd, buf, sizeof(buf), MSG_PEEK), send_len);
}
TEST_F(tls, shutdown) TEST_F(tls, shutdown)
{ {
char const *test_str = "test_read"; char const *test_str = "test_read";