linux-next/net/core
Richard Gobert 5ef31ea5d0 net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb
Commits a602456 ("udp: Add GRO functions to UDP socket") and 57c67ff ("udp:
additional GRO support") introduce incorrect usage of {ip,ipv6}_hdr in the
complete phase of gro. The functions always return skb->network_header,
which in the case of encapsulated packets at the gro complete phase, is
always set to the innermost L3 of the packet. That means that calling
{ip,ipv6}_hdr for skbs which completed the GRO receive phase (both in
gro_list and *_gro_complete) when parsing an encapsulated packet's _outer_
L3/L4 may return an unexpected value.

This incorrect usage leads to a bug in GRO's UDP socket lookup.
udp{4,6}_lib_lookup_skb functions use ip_hdr/ipv6_hdr respectively. These
*_hdr functions return network_header which will point to the innermost L3,
resulting in the wrong offset being used in __udp{4,6}_lib_lookup with
encapsulated packets.

This patch adds network_offset and inner_network_offset to napi_gro_cb, and
makes sure both are set correctly.

To fix the issue, network_offsets union is used inside napi_gro_cb, in
which both the outer and the inner network offsets are saved.

Reproduction example:

Endpoint configuration example (fou + local address bind)

    # ip fou add port 6666 ipproto 4
    # ip link add name tun1 type ipip remote 2.2.2.1 local 2.2.2.2 encap fou encap-dport 5555 encap-sport 6666 mode ipip
    # ip link set tun1 up
    # ip a add 1.1.1.2/24 dev tun1

Netperf TCP_STREAM result on net-next before patch is applied:

net-next main, GRO enabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.28        2.37

net-next main, GRO disabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.01     2745.06

patch applied, GRO enabled:
    $ netperf -H 1.1.1.2 -t TCP_STREAM -l 5
    Recv   Send    Send
    Socket Socket  Message  Elapsed
    Size   Size    Size     Time     Throughput
    bytes  bytes   bytes    secs.    10^6bits/sec

    131072  16384  16384    5.01     2877.38

Fixes: a6024562ff ("udp: Add GRO functions to UDP socket")
Signed-off-by: Richard Gobert <richardbgobert@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Reviewed-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-05-02 11:02:48 +02:00
..
bpf_sk_storage.c net: Namespace-ify sysctl_optmem_max 2023-12-15 11:01:27 +00:00
datagram.c net: Fix from address in memcpy_to_iter_csum() 2024-02-02 12:21:02 +00:00
dev_addr_lists_test.c net: fill in MODULE_DESCRIPTION()s under net/core 2023-10-28 11:29:27 +01:00
dev_addr_lists.c net: extract a few internals from netdevice.h 2022-04-07 20:32:09 -07:00
dev_ioctl.c net: partial revert of the "Make timestamping selectable: series 2023-11-18 18:42:37 -08:00
dev.c net/sched: Fix mirred deadlock on device recursion 2024-04-17 18:22:52 -07:00
dev.h net: move netdev_tstamp_prequeue into net_hotdata 2024-03-07 21:12:41 -08:00
drop_monitor.c genetlink: Use internal flags for multicast groups 2023-12-29 08:43:59 +00:00
dst_cache.c wireguard: device: reset peer src endpoint when netns exits 2021-11-29 19:50:45 -08:00
dst.c net: dst: Make dst_destroy() static and return void. 2024-02-06 11:45:53 +01:00
failover.c net: failover: use IFF_NO_ADDRCONF flag to prevent ipv6 addrconf 2022-12-12 15:18:25 -08:00
fib_notifier.c net: fib_notifier: propagate extack down to the notifier block callback 2019-10-04 11:10:56 -07:00
fib_rules.c fib: rules: remove repeated assignment in fib_nl2rule 2024-01-07 15:16:19 +00:00
filter.c xdp: use flags field to disambiguate broadcast redirect 2024-04-22 10:24:41 -07:00
flow_dissector.c net/core: Fix ETH_P_1588 flow dissector 2023-09-15 10:40:04 +01:00
flow_offload.c tc: flower: Enable offload support IPSEC SPI field. 2023-08-02 10:09:32 +01:00
gen_estimator.c treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
gen_stats.c net: Remove the obsolte u64_stats_fetch_*_irq() users (net). 2022-10-28 20:13:54 -07:00
gro_cells.c net: move netdev_max_backlog to net_hotdata 2024-03-07 21:12:42 -08:00
gro.c net: gro: fix udp bad offset in socket lookup by adding {inner_}network_offset to napi_gro_cb 2024-05-02 11:02:48 +02:00
gso_test.c net: test: Fix printf format specifier in skb_segment kunit test 2024-02-27 16:27:17 -07:00
gso.c net: introduce struct net_hotdata 2024-03-07 21:12:41 -08:00
hotdata.c net: move dev_rx_weight to net_hotdata 2024-03-07 21:12:42 -08:00
hwbm.c net: hwbm: Make the hwbm_pool lock a mutex 2019-06-09 19:40:10 -07:00
link_watch.c net: add netdev_set_operstate() helper 2024-02-14 11:20:13 +00:00
lwt_bpf.c lwt: Fix return values of BPF xmit ops 2023-08-18 16:05:26 +02:00
lwtunnel.c xfrm: lwtunnel: squelch kernel warning in case XFRM encap type is not available 2022-10-12 10:45:51 +02:00
Makefile net: introduce struct net_hotdata 2024-03-07 21:12:41 -08:00
neighbour.c neighbour: Don't let neigh_forced_gc() disable preemption for long 2023-12-08 10:37:43 +00:00
net_namespace.c net: use synchronize_rcu_expedited in cleanup_net() 2024-02-12 12:17:03 +00:00
net-procfs.c net: move ptype_all into net_hotdata 2024-03-07 21:12:41 -08:00
net-sysfs.c net: dqs: add NIC stall detector based on BQL 2024-03-08 10:23:26 +00:00
net-sysfs.h net-sysfs: add netdev_change_owner() 2020-02-26 20:07:25 -08:00
net-traces.c udp6: add a missing call into udp_fail_queue_rcv_skb tracepoint 2023-07-07 09:16:52 +01:00
netclassid_cgroup.c cgroup, netclassid: on modifying netclassid in cgroup, only consider the main process. 2023-10-16 16:36:53 -07:00
netdev-genl-gen.c netdev: add per-queue statistics 2024-03-07 21:13:25 -08:00
netdev-genl-gen.h netdev: add per-queue statistics 2024-03-07 21:13:25 -08:00
netdev-genl.c netdev: add queue stat for alloc failures 2024-03-07 21:13:26 -08:00
netevent.c net: core: Correct function name netevent_unregister_notifier() in the kerneldoc 2021-03-28 17:56:56 -07:00
netpoll.c netpoll: allocate netdev tracker right away 2023-06-15 08:21:11 +01:00
netprio_cgroup.c bpf, cgroups: Fix cgroup v2 fallback on v1/v2 mixed mode 2021-09-13 16:35:58 -07:00
of_net.c net: Explicitly include correct DT includes 2023-07-27 20:33:16 -07:00
page_pool_priv.h net: page_pool: report when page pool was destroyed 2023-11-28 15:48:39 +01:00
page_pool_user.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-03-07 10:29:36 -08:00
page_pool.c net: page_pool: factor out page_pool recycle check 2024-03-11 13:01:15 -07:00
pktgen.c net: pktgen: Use wait_event_freezable_timeout() for freezable kthread 2023-12-27 14:34:52 +00:00
ptp_classifier.c ptp: Add generic PTP is_sync() function 2022-03-07 11:31:34 +00:00
request_sock.c tcp: make sure init the accept_queue's spinlocks once 2024-01-19 21:13:25 -08:00
rtnetlink.c netlink: let core handle error cases in dump operations 2024-03-07 20:48:22 -08:00
scm.c af_unix: Try to run GC async. 2024-01-26 20:34:25 -08:00
secure_seq.c tcp: Fix data-races around sysctl knobs related to SYN option. 2022-07-20 10:14:49 +01:00
selftests.c net: fill in MODULE_DESCRIPTION()s under net/core 2023-10-28 11:29:27 +01:00
skbuff.c net: core: reject skb_copy(_expand) for fraglist GSO skbs 2024-05-01 11:44:10 +01:00
skmsg.c bpf, skmsg: Fix NULL pointer dereference in sk_psock_skb_ingress_enqueue 2024-04-08 09:18:22 +02:00
sock_destructor.h skb_expand_head() adjust skb->truesize incorrectly 2021-10-22 12:35:51 -07:00
sock_diag.c sock_diag: remove sock_diag_mutex 2024-01-23 15:13:55 +01:00
sock_map.c bpf, sockmap: Prevent lock inversion deadlock in map delete elem 2024-04-02 16:31:05 +02:00
sock_reuseport.c soreuseport: Fix socket selection for SO_INCOMING_CPU. 2022-10-25 11:35:16 +02:00
sock.c net: mark racy access on sk->sk_rcvbuf 2024-03-25 14:46:59 +00:00
stream.c net: Return error from sk_stream_wait_connect() if sk_wait_event() fails 2023-12-15 10:48:51 +00:00
sysctl_net_core.c net: move rps_sock_flow_table to net_hotdata 2024-03-07 21:12:43 -08:00
timestamping.c net: partial revert of the "Make timestamping selectable: series 2023-11-18 18:42:37 -08:00
tso.c net: tso: inline tso_count_descs() 2022-12-12 15:04:39 -08:00
utils.c net: core: inet[46]_pton strlen len types 2022-11-01 21:14:39 -07:00
xdp.c net: move skbuff_cache(s) to net_hotdata 2024-03-07 21:12:42 -08:00