linux-stable/net/sched
Jakub Kicinski 122aba8c80 net_sched: sch_fq: don't follow the fast path if Tx is behind now
Recent kernels cause a lot of TCP retransmissions

[ ID] Interval           Transfer     Bitrate         Retr  Cwnd
[  5]   0.00-1.00   sec  2.24 GBytes  19.2 Gbits/sec  2767    442 KBytes
[  5]   1.00-2.00   sec  2.23 GBytes  19.1 Gbits/sec  2312    350 KBytes
                                                      ^^^^

Replacing the qdisc with pfifo makes retransmissions go away.

It appears that a flow may have a delayed packet with a very near
Tx time. Later, we may get busy processing Rx and the target Tx time
will pass, but we won't service Tx since the CPU is busy with Rx.
If Rx sees an ACK and we try to push more data for the delayed flow
we may fastpath the skb, not realizing that there are already "ready
to send" packets for this flow sitting in the qdisc.

Don't trust the fastpath if we are "behind" according to the projected
Tx time for next flow waiting in the Qdisc. Because we consider anything
within the offload window to be okay for fastpath we must consider
the entire offload window as "now".

Qdisc config:

qdisc fq 8001: dev eth0 parent 1234:1 limit 10000p flow_limit 100p \
  buckets 32768 orphan_mask 1023 bands 3 \
  priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 \
  weights 589824 196608 65536 quantum 3028b initial_quantum 15140b \
  low_rate_threshold 550Kbit \
  refill_delay 40ms timer_slack 10us horizon 10s horizon_drop

For iperf this change seems to do fine, the reordering is gone.
The fastpath still gets used most of the time:

  gc 0 highprio 0 fastpath 142614 throttled 418309 latency 19.1us
   xx_behind 2731

where "xx_behind" counts how many times we hit the new "return false".

CC: stable@vger.kernel.org
Fixes: 076433bd78 ("net_sched: sch_fq: add fast path for mostly idle qdisc")
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20241124022148.3126719-1-kuba@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-11-28 10:11:59 +01:00
..
act_api.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-10-25 09:08:22 +02:00
act_bpf.c net: Rename mono_delivery_time to tstamp_type for scalabilty 2024-05-23 14:14:23 -07:00
act_connmark.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_csum.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_ct.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_ctinfo.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_gact.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_gate.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_ife.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_meta_mark.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_meta_skbprio.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_meta_skbtcindex.c treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 152 2019-05-30 11:26:32 -07:00
act_mirred.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-02-22 15:29:26 -08:00
act_mpls.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_nat.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_pedit.c net: sched: Annotate struct tc_pedit with __counted_by 2024-02-19 10:58:24 +00:00
act_police.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
act_sample.c net: sched: act_sample: add action cookie to sample 2024-07-05 17:45:47 -07:00
act_simple.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_skbedit.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
act_skbmod.c net/sched: act_skbmod: convert comma to semicolon 2024-07-11 17:12:15 -07:00
act_tunnel_key.c ip_tunnel: convert __be16 tunnel flags to bitmaps 2024-04-01 10:49:28 +01:00
act_vlan.c tc: adjust network header after 2nd vlan push 2024-08-27 11:37:42 +02:00
cls_api.c net: sched: cls_api: improve the error message for ID allocation failure 2024-11-12 12:58:31 +01:00
cls_basic.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_bpf.c net: Rename mono_delivery_time to tstamp_type for scalabilty 2024-05-23 14:14:23 -07:00
cls_cgroup.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_flow.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_flower.c net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK 2024-07-15 09:14:39 -07:00
cls_fw.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_matchall.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_route.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
cls_u32.c net: sched: cls_u32: Fix u32's systematic failure to free IDR entries for hnodes. 2024-11-12 18:26:03 -08:00
em_canid.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
em_cmp.c move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
em_ipset.c sched: consistently handle layer3 header accesses in the presence of VLANs 2020-07-03 14:34:53 -07:00
em_ipt.c sched: consistently handle layer3 header accesses in the presence of VLANs 2020-07-03 14:34:53 -07:00
em_meta.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
em_nbyte.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
em_text.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
em_u32.c net: fill in MODULE_DESCRIPTION()s for net/sched 2024-02-09 14:12:02 -08:00
ematch.c net_sched: reject TCF_EM_SIMPLE case for complex ematch module 2022-12-19 09:43:18 +00:00
Kconfig net: sched: Remove NET_ACT_IPT from Kconfig 2024-02-13 11:24:35 +01:00
Makefile net/sched: Retire ipt action 2024-01-02 12:41:16 +00:00
sch_api.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-10-31 18:10:07 -07:00
sch_blackhole.c Revert "net: sched: Pass root lock to Qdisc_ops.enqueue" 2020-07-16 16:48:34 -07:00
sch_cake.c sch_cake: constify inverse square root cache 2024-09-10 18:31:52 -07:00
sch_cbs.c net/sched: cbs: Fix integer overflow in cbs_set_port_rate() 2024-10-15 18:25:47 -07:00
sch_choke.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_codel.c net_sched: sch_codel: implement lockless codel_dump() 2024-04-19 11:34:07 +01:00
sch_drr.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
sch_etf.c net_sched: sch_tfs: implement lockless etf_dump() 2024-04-19 11:34:07 +01:00
sch_ets.c net_sched: sch_ets: implement lockless ets_dump() 2024-04-19 11:34:07 +01:00
sch_fifo.c net_sched: sch_fifo: implement lockless __fifo_dump() 2024-04-19 11:34:07 +01:00
sch_fq_codel.c net_sched: sch_fq_codel: implement lockless fq_codel_dump() 2024-04-19 11:34:07 +01:00
sch_fq_pie.c net_sched: sch_fq_pie: implement lockless fq_pie_dump() 2024-04-19 11:34:07 +01:00
sch_fq.c net_sched: sch_fq: don't follow the fast path if Tx is behind now 2024-11-28 10:11:59 +01:00
sch_frag.c net: dst: remove unnecessary input parameter in dst_alloc and dst_init 2023-09-12 11:42:25 +02:00
sch_generic.c net: fix races in netdev_tx_sent_queue()/dev_watchdog() 2024-10-21 12:54:25 +02:00
sch_gred.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_hfsc.c net_sched: sch_hfsc: implement lockless accesses to q->defcls 2024-04-19 11:34:08 +01:00
sch_hhf.c net_sched: sch_hhf: implement lockless hhf_dump() 2024-04-19 11:34:08 +01:00
sch_htb.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_ingress.c bpf: Fix too early release of tcx_entry 2024-07-08 14:07:31 -07:00
sch_mq.c net: sched: add rcu annotations around qdisc->qdisc_sleeping 2023-06-07 10:25:39 +01:00
sch_mqprio_lib.c net: sched: Fill in missing MODULE_DESCRIPTION for qdiscs 2023-11-01 21:49:09 -07:00
sch_mqprio_lib.h net/sched: mqprio: allow per-TC user input of FP adminStatus 2023-04-13 22:22:10 -07:00
sch_mqprio.c netlink: introduce type-checking attribute iteration 2024-03-29 15:06:02 -07:00
sch_multiq.c net: sched: sch_multiq: fix possible OOB write in multiq_tune() 2024-06-05 10:50:19 +01:00
sch_netem.c netem: Include <linux/prandom.h> in sch_netem.c 2024-10-03 18:20:31 +02:00
sch_pie.c net_sched: sch_pie: implement lockless pie_dump() 2024-04-19 11:34:08 +01:00
sch_plug.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
sch_prio.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
sch_qfq.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_red.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_sfb.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
sch_sfq.c net_sched: sch_sfq: handle bigger packets 2024-10-09 19:50:31 -07:00
sch_skbprio.c net_sched: sch_skbprio: implement lockless skbprio_dump() 2024-04-19 11:34:08 +01:00
sch_taprio.c net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
sch_tbf.c net/sched: Add module aliases for cls_,sch_,act_ modules 2024-02-02 10:57:55 -08:00
sch_teql.c net: annotate writes on dev->mtu from ndo_change_mtu() 2024-05-07 16:19:14 -07:00