linux-next/net/sched
Ido Schimmel 32f2a0afa9 net/sched: flower: Fix chain template offload
When a qdisc is deleted from a net device the stack instructs the
underlying driver to remove its flow offload callback from the
associated filter block using the 'FLOW_BLOCK_UNBIND' command. The stack
then continues to replay the removal of the filters in the block for
this driver by iterating over the chains in the block and invoking the
'reoffload' operation of the classifier being used. In turn, the
classifier in its 'reoffload' operation prepares and emits a
'FLOW_CLS_DESTROY' command for each filter.

However, the stack does not do the same for chain templates and the
underlying driver never receives a 'FLOW_CLS_TMPLT_DESTROY' command when
a qdisc is deleted. This results in a memory leak [1] which can be
reproduced using [2].

Fix by introducing a 'tmplt_reoffload' operation and have the stack
invoke it with the appropriate arguments as part of the replay.
Implement the operation in the sole classifier that supports chain
templates (flower) by emitting the 'FLOW_CLS_TMPLT_{CREATE,DESTROY}'
command based on whether a flow offload callback is being bound to a
filter block or being unbound from one.

As far as I can tell, the issue happens since cited commit which
reordered tcf_block_offload_unbind() before tcf_block_flush_all_chains()
in __tcf_block_put(). The order cannot be reversed as the filter block
is expected to be freed after flushing all the chains.

[1]
unreferenced object 0xffff888107e28800 (size 2048):
  comm "tc", pid 1079, jiffies 4294958525 (age 3074.287s)
  hex dump (first 32 bytes):
    b1 a6 7c 11 81 88 ff ff e0 5b b3 10 81 88 ff ff  ..|......[......
    01 00 00 00 00 00 00 00 e0 aa b0 84 ff ff ff ff  ................
  backtrace:
    [<ffffffff81c06a68>] __kmem_cache_alloc_node+0x1e8/0x320
    [<ffffffff81ab374e>] __kmalloc+0x4e/0x90
    [<ffffffff832aec6d>] mlxsw_sp_acl_ruleset_get+0x34d/0x7a0
    [<ffffffff832bc195>] mlxsw_sp_flower_tmplt_create+0x145/0x180
    [<ffffffff832b2e1a>] mlxsw_sp_flow_block_cb+0x1ea/0x280
    [<ffffffff83a10613>] tc_setup_cb_call+0x183/0x340
    [<ffffffff83a9f85a>] fl_tmplt_create+0x3da/0x4c0
    [<ffffffff83a22435>] tc_ctl_chain+0xa15/0x1170
    [<ffffffff838a863c>] rtnetlink_rcv_msg+0x3cc/0xed0
    [<ffffffff83ac87f0>] netlink_rcv_skb+0x170/0x440
    [<ffffffff83ac6270>] netlink_unicast+0x540/0x820
    [<ffffffff83ac6e28>] netlink_sendmsg+0x8d8/0xda0
    [<ffffffff83793def>] ____sys_sendmsg+0x30f/0xa80
    [<ffffffff8379d29a>] ___sys_sendmsg+0x13a/0x1e0
    [<ffffffff8379d50c>] __sys_sendmsg+0x11c/0x1f0
    [<ffffffff843b9ce0>] do_syscall_64+0x40/0xe0
unreferenced object 0xffff88816d2c0400 (size 1024):
  comm "tc", pid 1079, jiffies 4294958525 (age 3074.287s)
  hex dump (first 32 bytes):
    40 00 00 00 00 00 00 00 57 f6 38 be 00 00 00 00  @.......W.8.....
    10 04 2c 6d 81 88 ff ff 10 04 2c 6d 81 88 ff ff  ..,m......,m....
  backtrace:
    [<ffffffff81c06a68>] __kmem_cache_alloc_node+0x1e8/0x320
    [<ffffffff81ab36c1>] __kmalloc_node+0x51/0x90
    [<ffffffff81a8ed96>] kvmalloc_node+0xa6/0x1f0
    [<ffffffff82827d03>] bucket_table_alloc.isra.0+0x83/0x460
    [<ffffffff82828d2b>] rhashtable_init+0x43b/0x7c0
    [<ffffffff832aed48>] mlxsw_sp_acl_ruleset_get+0x428/0x7a0
    [<ffffffff832bc195>] mlxsw_sp_flower_tmplt_create+0x145/0x180
    [<ffffffff832b2e1a>] mlxsw_sp_flow_block_cb+0x1ea/0x280
    [<ffffffff83a10613>] tc_setup_cb_call+0x183/0x340
    [<ffffffff83a9f85a>] fl_tmplt_create+0x3da/0x4c0
    [<ffffffff83a22435>] tc_ctl_chain+0xa15/0x1170
    [<ffffffff838a863c>] rtnetlink_rcv_msg+0x3cc/0xed0
    [<ffffffff83ac87f0>] netlink_rcv_skb+0x170/0x440
    [<ffffffff83ac6270>] netlink_unicast+0x540/0x820
    [<ffffffff83ac6e28>] netlink_sendmsg+0x8d8/0xda0
    [<ffffffff83793def>] ____sys_sendmsg+0x30f/0xa80

[2]
 # tc qdisc add dev swp1 clsact
 # tc chain add dev swp1 ingress proto ip chain 1 flower dst_ip 0.0.0.0/32
 # tc qdisc del dev swp1 clsact
 # devlink dev reload pci/0000:06:00.0

Fixes: bbf73830cd ("net: sched: traverse chains in block with tcf_get_next_chain()")
Signed-off-by: Ido Schimmel <idosch@nvidia.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-01-24 01:33:59 +00:00
..
act_api.c net/sched: simplify tc_action_load_ops parameters 2024-01-07 14:58:26 +00:00
act_bpf.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_connmark.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_csum.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_ct.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-01-09 16:23:26 +01:00
act_ctinfo.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_gact.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_gate.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_ife.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_meta_mark.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_meta_skbprio.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_meta_skbtcindex.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_mirred.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_mpls.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_nat.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_pedit.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_police.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_sample.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_simple.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_skbedit.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_skbmod.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_tunnel_key.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
act_vlan.c net/sched: introduce ACT_P_BOUND return code 2024-01-03 18:36:24 -08:00
cls_api.c net/sched: flower: Fix chain template offload 2024-01-24 01:33:59 +00:00
cls_basic.c net: sched: Fill in missing MODULE_DESCRIPTION for classifiers 2023-11-01 21:49:09 -07:00
cls_bpf.c net: sched: cls_bpf: Undo tcf_bind_filter in case of an error 2023-07-17 07:33:39 +01:00
cls_cgroup.c net: sched: Fill in missing MODULE_DESCRIPTION for classifiers 2023-11-01 21:49:09 -07:00
cls_flow.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
cls_flower.c net/sched: flower: Fix chain template offload 2024-01-24 01:33:59 +00:00
cls_fw.c net: sched: Fill in missing MODULE_DESCRIPTION for classifiers 2023-11-01 21:49:09 -07:00
cls_matchall.c net: sched: cls_matchall: Undo tcf_bind_filter in case of failure after mall_set_parms 2023-07-17 07:33:38 +01:00
cls_route.c net: sched: Fill in missing MODULE_DESCRIPTION for classifiers 2023-11-01 21:49:09 -07:00
cls_u32.c net/sched: cls_u32: replace int refcounts with proper refcounts 2023-11-18 19:38:23 -08:00
em_canid.c net: sched: kerneldoc fixes 2020-07-13 17:20:40 -07:00
em_cmp.c net: sched: fix misspellings using misspell-fixer tool 2020-11-10 17:00:28 -08:00
em_ipset.c sched: consistently handle layer3 header accesses in the presence of VLANs 2020-07-03 14:34:53 -07:00
em_ipt.c sched: consistently handle layer3 header accesses in the presence of VLANs 2020-07-03 14:34:53 -07:00
em_meta.c net: implement lockless SO_PRIORITY 2023-10-01 19:09:54 +01:00
em_nbyte.c net: sched: Return the correct errno code 2021-02-06 11:15:28 -08:00
em_text.c net: sched: em_text: fix possible memory leak in em_text_destroy() 2024-01-01 13:08:15 +00:00
em_u32.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
ematch.c net_sched: reject TCF_EM_SIMPLE case for complex ematch module 2022-12-19 09:43:18 +00:00
Kconfig bpf: Add fd-based tcx multi-prog infra with link support 2023-07-19 10:07:27 -07:00
Makefile net/sched: Retire ipt action 2024-01-02 12:41:16 +00:00
sch_api.c net: sched: move block device tracking into tcf_block_get/put_ext() 2024-01-05 11:19:08 +00:00
sch_blackhole.c Revert "net: sched: Pass root lock to Qdisc_ops.enqueue" 2020-07-16 16:48:34 -07:00
sch_cake.c net: move gso declarations and functions to their own files 2023-06-10 00:11:41 -07:00
sch_cbs.c net/sched: cbs: Use units.h instead of the copy of a definition 2023-11-30 23:15:48 -08:00
sch_choke.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_codel.c net: sched: remove redundant NULL check in change hook function 2022-09-01 08:06:45 +02:00
sch_drr.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_etf.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_ets.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_fifo.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_fq_codel.c Revert "net: sched: fq_codel: remove redundant resource cleanup in fq_codel_init()" 2022-10-19 13:47:09 +01:00
sch_fq_pie.c netlink: make range pointers in policies const 2023-10-26 16:24:09 -07:00
sch_fq.c net_sched: sch_fq: better validate TCA_FQ_WEIGHTS and TCA_FQ_PRIOMAP 2023-11-08 18:30:21 -08:00
sch_frag.c net: dst: remove unnecessary input parameter in dst_alloc and dst_init 2023-09-12 11:42:25 +02:00
sch_generic.c net: sched: move block device tracking into tcf_block_get/put_ext() 2024-01-05 11:19:08 +00:00
sch_gred.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_hfsc.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_hhf.c net: sched: remove redundant NULL check in change hook function 2022-09-01 08:06:45 +02:00
sch_htb.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_ingress.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_mq.c net: sched: add rcu annotations around qdisc->qdisc_sleeping 2023-06-07 10:25:39 +01:00
sch_mqprio_lib.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_mqprio_lib.h net/sched: mqprio: allow per-TC user input of FP adminStatus 2023-04-13 22:22:10 -07:00
sch_mqprio.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_multiq.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_netem.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_pie.c net: sched: add rcu annotations around qdisc->qdisc_sleeping 2023-06-07 10:25:39 +01:00
sch_plug.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_prio.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_qfq.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_red.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_sfb.c Networking fixes for 6.1-rc2, including fixes from netfilter 2022-10-20 17:24:59 -07:00
sch_sfq.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_skbprio.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_taprio.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_tbf.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_teql.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00