linux-stable/net
Terin Stock d7fce52fdf ipvs: align inner_mac_header for encapsulation
When using encapsulation the original packet's headers are copied to the
inner headers. This preserves the space for an inner mac header, which
is not used by the inner payloads for the encapsulation types supported
by IPVS. If a packet is using GUE or GRE encapsulation and needs to be
segmented, flow can be passed to __skb_udp_tunnel_segment() which
calculates a negative tunnel header length. A negative tunnel header
length causes pskb_may_pull() to fail, dropping the packet.

This can be observed by attaching probes to ip_vs_in_hook(),
__dev_queue_xmit(), and __skb_udp_tunnel_segment():

    perf probe --add '__dev_queue_xmit skb->inner_mac_header \
    skb->inner_network_header skb->mac_header skb->network_header'
    perf probe --add '__skb_udp_tunnel_segment:7 tnl_hlen'
    perf probe -m ip_vs --add 'ip_vs_in_hook skb->inner_mac_header \
    skb->inner_network_header skb->mac_header skb->network_header'

These probes report the header offsets and the tunnel header length for
packets which traverse the IPVS encapsulation path. A TCP packet can be
forced into the segmentation path by being smaller than a calculated
clamped MSS, but larger than the advertised MSS.

    probe:ip_vs_in_hook: inner_mac_header=0x0 inner_network_header=0x0 mac_header=0x44 network_header=0x52
    probe:ip_vs_in_hook: inner_mac_header=0x44 inner_network_header=0x52 mac_header=0x44 network_header=0x32
    probe:dev_queue_xmit: inner_mac_header=0x44 inner_network_header=0x52 mac_header=0x44 network_header=0x32
    probe:__skb_udp_tunnel_segment_L7: tnl_hlen=-2
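
The tnl_hlen=-2 value can be reproduced from the probed offsets. The
following is a minimal userspace sketch, not kernel code. It assumes the
outer header is IPv4 without options (20 bytes), so that the outer
transport header sits at network_header + 20, and that the tunnel header
length is the inner mac header offset minus the outer transport header
offset. The struct and function names are hypothetical and chosen only
for illustration.

```c
#include <assert.h>

/* Header offsets into the packet buffer, as reported by the probes. */
struct hdr_offsets {
    int inner_mac_header;
    int inner_network_header;
    int network_header;       /* outer IP header */
};

/* Assumption: outer IPv4 header without options. */
enum { OUTER_IPV4_HLEN = 20 };

/* Offsets from the dev_queue_xmit probe line after encapsulation. */
struct hdr_offsets probed_offsets(void)
{
    struct hdr_offsets h = {
        .inner_mac_header = 0x44,
        .inner_network_header = 0x52,
        .network_header = 0x32,
    };
    return h;
}

/* Offset of the outer transport (UDP) header under the assumption above. */
int transport_header(const struct hdr_offsets *h)
{
    assert(h != 0);
    return h->network_header + OUTER_IPV4_HLEN;
}

/* Tunnel header length: inner mac header minus outer transport header. */
int tnl_hlen(const struct hdr_offsets *h)
{
    return h->inner_mac_header - transport_header(h);
}
```

With the probed values the outer transport header lands at 0x46, which
is two bytes past the stale inner mac header at 0x44, hence the negative
length.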

When using veth-based encapsulation, the interfaces are set to be
mac-less, which does not preserve space for an inner mac header. This
prevents this issue from occurring.

In our real-world testing of sending a 32KB file we observed operation
time increasing from ~75ms for veth-based encapsulation to over 1.5s
using IPVS encapsulation due to retries from dropped packets.

This changeset modifies the packet on the encapsulation path in
ip_vs_tunnel_xmit() and ip_vs_tunnel_xmit_v6() to remove the inner mac
header offset. This fixes UDP segmentation for both encapsulation types,
and corrects the inner headers for any IPIP flows that may use it.
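
The effect of removing the inner mac header offset can be sketched in
userspace as follows. This is an illustrative model under stated
assumptions, not the patch itself: the struct and helper names are
hypothetical, the offsets are taken from the probe output above, and the
outer transport header offset of 0x46 assumes a 20-byte outer IPv4
header.

```c
#include <assert.h>

/* Simplified stand-in for the header offsets tracked on a packet. */
struct hdr_offsets {
    int inner_mac_header;
    int inner_network_header;
    int transport_header;     /* outer UDP header, assumed at 0x46 */
};

/*
 * Model of the fix: the inner payload has no mac header, so the inner
 * mac header offset is made equal to the inner network header offset.
 */
void align_inner_mac_header(struct hdr_offsets *h)
{
    assert(h != 0);
    h->inner_mac_header = h->inner_network_header;
}

/* Tunnel header length as seen by the UDP tunnel segmentation path. */
int tunnel_header_len(const struct hdr_offsets *h)
{
    return h->inner_mac_header - h->transport_header;
}
```

Starting from inner_mac_header=0x44 and inner_network_header=0x52, the
length goes from -2 to 12 once the offsets are aligned, which would be
consistent with an 8-byte UDP header followed by a 4-byte GUE header.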

Fixes: 84c0d5e96f ("ipvs: allow tunneling with gue encapsulation")
Signed-off-by: Terin Stock <terin@cloudflare.com>
Acked-by: Julian Anastasov <ja@ssi.bg>
Acked-by: Simon Horman <horms@kernel.org>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
2023-06-19 16:01:07 +02:00
6lowpan 6lowpan: Remove redundant initialisation. 2023-03-29 08:22:52 +01:00
9p Including fixes from netfilter. 2023-05-05 19:12:01 -07:00
802 treewide: Convert del_timer*() to timer_shutdown*() 2022-12-25 13:38:09 -08:00
8021q vlan: fix a potential uninit-value in vlan_dev_hard_start_xmit() 2023-05-17 12:55:39 +01:00
appletalk
atm atm: hide unused procfs functions 2023-05-17 21:27:30 -07:00
ax25 ax25: af_ax25: Remove unnecessary (void*) conversions 2022-11-16 13:31:03 +00:00
batman-adv batman-adv: Broken sync while rescheduling delayed work 2023-05-26 23:14:49 +02:00
bluetooth Bluetooth: L2CAP: Add missing checks for invalid DCID 2023-06-05 17:24:14 -07:00
bpf bpf: add test_run support for netfilter program type 2023-04-21 11:34:50 -07:00
bpfilter
bridge bridge: always declare tunnel functions 2023-05-17 21:28:58 -07:00
caif net: caif: Fix use-after-free in cfusbl_device_notify() 2023-03-02 22:22:07 -08:00
can can: j1939: avoid possible use-after-free when j1939_can_rx_register fails 2023-06-05 08:26:40 +02:00
ceph Networking changes for 6.3. 2023-02-21 18:24:12 -08:00
core bpf-for-netdev 2023-06-07 21:47:11 -07:00
dcb net: dcb: add helper functions to retrieve PCP and DSCP rewrite maps 2023-01-20 09:33:22 +00:00
dccp dccp: Print deprecation notice. 2023-06-15 15:08:59 -07:00
devlink devlink: Fix crash with CONFIG_NET_NS=n 2023-05-16 19:57:52 -07:00
dns_resolver cred: Do not default to init_cred in prepare_kernel_cred() 2022-11-01 10:04:52 -07:00
dsa net: dsa: tag_ocelot: call only the relevant portion of __skb_vlan_pop() on TX 2023-04-23 14:16:45 +01:00
ethernet net: ethernet: use sysfs_emit() to instead of scnprintf() 2022-12-07 20:02:44 -08:00
ethtool ethtool: Fix uninitialized number of lanes 2023-05-03 09:13:20 +01:00
handshake net/handshake: remove fput() that causes use-after-free 2023-06-14 22:26:37 -07:00
hsr hsr: ratelimit only when errors are printed 2023-03-16 21:11:03 -07:00
ieee802154 net: ieee802154: remove an unnecessary null pointer check 2023-03-17 09:13:53 +01:00
ife
ipv4 udplite: Print deprecation notice. 2023-06-15 15:08:58 -07:00
ipv6 udplite: Print deprecation notice. 2023-06-15 15:08:58 -07:00
iucv net/iucv: Fix size of interrupt data 2023-03-16 17:34:40 -07:00
kcm net/sock: Introduce trace_sk_data_ready() 2023-01-23 11:26:50 +00:00
key af_key: Reject optional tunnel/BEET mode templates in outbound policies 2023-05-10 07:04:51 +02:00
l2tp l2tp: generate correct module alias strings 2023-03-31 09:25:12 +01:00
l3mdev
lapb
llc net: deal with most data-races in sk_wait_event() 2023-05-10 10:03:32 +01:00
mac80211 wifi: mac80211: fragment per STA profile correctly 2023-06-12 09:52:52 +02:00
mac802154 mac802154: Rename kfree_rcu() to kvfree_rcu_mightsleep() 2023-04-05 13:48:04 +00:00
mctp mctp: remove MODULE_LICENSE in non-modules 2023-03-09 23:06:21 -08:00
mpls net: mpls: fix stale pointer if allocation fails during device rename 2023-02-15 10:26:37 +00:00
mptcp mptcp: update userspace pm infos 2023-06-05 15:15:57 +01:00
ncsi net/ncsi: clear Tx enable mode when handling a Config required AEN 2023-04-28 09:35:33 +01:00
netfilter ipvs: align inner_mac_header for encapsulation 2023-06-19 16:01:07 +02:00
netlabel netlabel: fix shift wrapping bug in netlbl_catmap_setlong() 2023-06-10 19:54:06 +01:00
netlink net/netlink: fix NETLINK_LIST_MEMBERSHIPS length report 2023-05-31 00:02:24 -07:00
netrom netrom: fix info-leak in nr_write_internal() 2023-05-25 21:02:29 -07:00
nfc nfc: change order inside nfc_se_io error path 2023-03-07 13:37:05 -08:00
nsh net: nsh: Use correct mac_offset to unwind gso skb in nsh_gso_segment() 2023-05-15 08:40:27 +01:00
openvswitch net: openvswitch: fix upcall counter access before allocation 2023-06-07 12:25:05 +01:00
packet af_packet: do not use READ_ONCE() in packet_bind() 2023-05-29 22:03:48 -07:00
phonet net/sock: Introduce trace_sk_data_ready() 2023-01-23 11:26:50 +00:00
psample genetlink: start to validate reserved header bytes 2022-08-29 12:47:15 +01:00
qrtr net: qrtr: Fix an uninit variable access bug in qrtr_tx_resume() 2023-04-13 09:35:30 +02:00
rds rds: rds_rm_zerocopy_callback() correct order for list_add_tail() 2023-02-13 09:33:39 +00:00
rfkill net: rfkill-gpio: Add explicit include for of.h 2023-04-06 20:36:27 +02:00
rose net/rose: Fix to not accept on connected socket 2023-01-28 00:19:57 -08:00
rxrpc rxrpc: Truncate UTS_RELEASE for rxrpc version 2023-05-30 10:01:06 +02:00
sched net/sched: cls_api: Fix lockup on flushing explicitly created chain 2023-06-14 23:03:16 -07:00
sctp sctp: fix an error code in sctp_sf_eat_auth() 2023-06-12 09:36:27 +01:00
smc net/smc: Avoid to access invalid RMBs' MRs in SMCRv1 ADD LINK CONT 2023-06-03 20:51:04 +01:00
strparser strparser: pad sk_skb_cb to avoid straddling cachelines 2022-07-08 18:38:44 -07:00
sunrpc nfsd-6.4 fixes: 2023-06-02 13:38:55 -04:00
switchdev net: rename reference+tracking helpers 2022-06-09 21:52:55 -07:00
tipc net: tipc: resize nlattr array to correct size 2023-06-15 14:59:17 -07:00
tls tls: improve lockless access safety of tls_err_abort() 2023-05-26 10:35:58 +01:00
unix bpf, sockmap: Pass skb ownership through read_skb 2023-05-23 16:09:47 +02:00
vmw_vsock bpf, sockmap: Pass skb ownership through read_skb 2023-05-23 16:09:47 +02:00
wireless wifi: cfg80211: remove links only on AP 2023-06-09 13:30:53 +02:00
x25 net/x25: Fix to not accept on connected socket 2023-01-25 09:51:04 +00:00
xdp bpf-next-for-netdev 2023-04-13 16:43:38 -07:00
xfrm ipsec-2023-05-16 2023-05-16 20:52:35 -07:00
compat.c net/compat: Update msg_control_is_user when setting a kernel pointer 2023-04-14 11:09:27 +01:00
devres.c
Kconfig net/handshake: Add Kunit tests for the handshake consumer API 2023-04-19 18:48:48 -07:00
Kconfig.debug net: make NET_(DEV|NS)_REFCNT_TRACKER depend on NET 2022-09-20 14:23:56 -07:00
Makefile net/handshake: Create a NETLINK service for handling handshake requests 2023-04-19 18:48:48 -07:00
socket.c net: annotate sk->sk_err write from do_recvmmsg() 2023-05-10 09:58:29 +01:00
sysctl_net.c