linux/net
Philo Lu 78c91ae2c6 ipv4/udp: Add 4-tuple hash for connected socket
Currently, the udp_table has two hash table, the port hash and portaddr
hash. Usually for UDP servers, all sockets have the same local port and
addr, so they are all on the same hash slot within a reuseport group.

In some applications, UDP servers use connect() to manage clients. In
particular, when firstly receiving from an unseen 4 tuple, a new socket
is created and connect()ed to the remote addr:port, and then the fd is
used exclusively by the client.

Once there are connected sks in a reuseport group, udp has to score all
sks in the same hash2 slot to find the best match. This could be
inefficient with a large number of connections, resulting in high
softirq overhead.

To solve the problem, this patch implement 4-tuple hash for connected
udp sockets. During connect(), hash4 slot is updated, as well as a
corresponding counter, hash4_cnt, in hslot2. In __udp4_lib_lookup(),
hslot4 will be searched firstly if the counter is non-zero. Otherwise,
hslot2 is used like before. Note that only connected sockets enter this
hash4 path, while un-connected ones are not affected.

hlist_nulls is used for hash4, because we probably move to another hslot
wrongly when lookup with concurrent rehash. Then we check nulls at the
list end to see if we should restart lookup. Because udp does not use
SLAB_TYPESAFE_BY_RCU, we don't need to touch sk_refcnt when lookup.

Stress test results (with 1 cpu fully used) are shown below, in pps:
(1) _un-connected_ socket as server
    [a] w/o hash4: 1,825176
    [b] w/  hash4: 1,831750 (+0.36%)

(2) 500 _connected_ sockets as server
    [c] w/o hash4:   290860 (only 16% of [a])
    [d] w/  hash4: 1,889658 (+3.1% compared with [b])

With hash4, compute_score is skipped when lookup, so [d] is slightly
better than [b].

Co-developed-by: Cambda Zhu <cambda@linux.alibaba.com>
Signed-off-by: Cambda Zhu <cambda@linux.alibaba.com>
Co-developed-by: Fred Chen <fred.cc@alibaba-inc.com>
Signed-off-by: Fred Chen <fred.cc@alibaba-inc.com>
Co-developed-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com>
Signed-off-by: Yubing Qiu <yubing.qiuyubing@alibaba-inc.com>
Signed-off-by: Philo Lu <lulie@linux.alibaba.com>
Acked-by: Willem de Bruijn <willemb@google.com>
Acked-by: Paolo Abeni <pabeni@redhat.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2024-11-18 11:56:21 +00:00
..
6lowpan ipv6: eliminate ndisc_ops_is_useropt() 2024-08-12 17:23:57 -07:00
9p 9p: fix slab cache name creation for real 2024-10-21 15:41:29 -07:00
802 move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
8021q net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
appletalk appletalk: Remove deadcode 2024-10-04 12:42:32 +01:00
atm atm: clean up a put_user() calls 2024-06-14 19:08:50 -07:00
ax25 ax25: Replace kfree() in ax25_dev_free() with ax25_dev_put() 2024-06-01 15:49:42 -07:00
batman-adv This cleanup patchset includes the following patches: 2024-10-15 15:28:17 +02:00
bluetooth Bluetooth: MGMT: Add initial implementation of MGMT_OP_HCI_CMD_SYNC 2024-11-14 15:41:31 -05:00
bpf bpf, test_run: Fix LIVE_FRAME frame update after a page has been recycled 2024-10-31 16:15:21 +01:00
bridge ndo_fdb_del: Add a parameter to report whether notification was sent 2024-11-15 16:39:18 -08:00
caif caif: Remove unused cfsrvl_getphyid 2024-10-08 15:33:49 -07:00
can can: gw: Use rtnl_register_many(). 2024-10-15 18:52:26 -07:00
ceph libceph: use min() to simplify code in ceph_dns_resolve_name() 2024-08-27 09:30:16 +02:00
core ndo_fdb_del: Add a parameter to report whether notification was sent 2024-11-15 16:39:18 -08:00
dcb dcb: Use rtnl_register_many(). 2024-10-15 18:52:26 -07:00
dccp net: fix data-races around sk->sk_forward_alloc 2024-11-11 15:29:33 -08:00
devlink net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
dns_resolver
dsa net: dsa: use ethtool string helpers 2024-11-03 10:36:34 -08:00
ethernet netkit: Fix pkt_type override upon netkit pass verdict 2024-05-25 10:48:57 -07:00
ethtool net: ethtool: account for RSS+RXNFC add semantics when checking channel count 2024-11-14 19:53:42 -08:00
handshake net/handshake: use sockfd_put() helper 2024-08-27 16:09:25 -07:00
hsr net: hsr: Add VLAN CTAG filter support 2024-11-11 16:40:44 -08:00
ieee802154 net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
ife
ipv4 ipv4/udp: Add 4-tuple hash for connected socket 2024-11-18 11:56:21 +00:00
ipv6 ipv4/udp: Add 4-tuple hash for connected socket 2024-11-18 11:56:21 +00:00
iucv s390/iucv: Fix vargs handling in iucv_alloc_device() 2024-08-22 13:09:20 -07:00
kcm kcm: replace call_rcu by kfree_rcu for simple kmem_cache_free callback 2024-10-15 10:50:21 -07:00
key xfrm: Add support for per cpu xfrm state handling. 2024-10-29 11:56:00 +01:00
l2tp genetlink: hold RCU in genlmsg_mcast() 2024-10-15 17:52:58 -07:00
l3mdev
lapb
llc llc: Constify struct llc_sap_state_trans 2024-07-15 08:51:19 -07:00
mac80211 wireless-next patches for v6.13 2024-11-13 18:35:19 -08:00
mac802154 Including fixes from ieee802154, bluetooth and netfilter. 2024-10-03 09:44:00 -07:00
mctp net: mctp: Expose transport binding identifier via IFLA attribute 2024-11-09 09:04:54 -08:00
mpls rtnetlink: Return int from rtnl_af_register(). 2024-10-22 11:02:05 +02:00
mptcp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
ncsi net/ncsi: Disable the ncsi work before freeing the associated structure 2024-10-03 10:14:14 +02:00
netfilter netfilter: bitwise: add support for doing AND, OR and XOR directly 2024-11-15 12:07:04 +01:00
netlabel net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
netlink net/netlink: Correct the comment on netlink message max cap 2024-11-15 16:14:16 -08:00
netrom net/netrom: prefer strscpy over strcpy 2024-08-29 12:33:07 -07:00
nfc net: nfc: Propagate ISO14443 type A target ATS to userspace via netlink 2024-11-07 10:21:58 +01:00
nsh nsh: Restore skb->{protocol,data,mac_header} for outer header in nsh_gso_segment(). 2024-04-26 12:20:01 +02:00
openvswitch net: convert to nla_get_*_default() 2024-11-11 10:32:06 -08:00
packet af_packet: avoid erroring out after sock_init_data() in packet_create() 2024-10-15 18:43:07 -07:00
phonet phonet: do not call synchronize_rcu() from phonet_route_del() 2024-11-07 20:34:16 -08:00
psample net: psample: fix flag being set in wrong skb 2024-07-11 18:11:31 -07:00
qrtr net: qrtr: Update packets cloning when broadcasting 2024-09-24 10:48:16 +02:00
rds net/rds: remove unused struct 'rds_ib_dereg_odp_mr' 2024-10-03 16:42:52 -07:00
rfkill net: rfkill: gpio: Add check for clk_enable() 2024-11-12 13:30:31 +01:00
rose net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
rxrpc rxrpc: Add a tracepoint for aborts being proposed 2024-11-11 15:27:46 -08:00
sched Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
sctp Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
shaper net-shapers: implement cap validation in the core 2024-10-10 08:30:23 -07:00
smc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-07 13:44:16 -08:00
strparser
sunrpc mm: page_frag: avoid caller accessing 'page_frag_cache' directly 2024-11-11 10:56:27 -08:00
switchdev net: bridge: switchdev: Improve error message for port_obj_add/del functions 2024-05-08 12:19:12 +01:00
tipc Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-09-15 09:13:19 -07:00
tls move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
unix af_unix: Don't return OOB skb in manage_oob(). 2024-09-09 17:14:27 -07:00
vmw_vsock Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net 2024-11-14 11:29:15 -08:00
wireless wireless-next patches for v6.13 2024-11-13 18:35:19 -08:00
x25 net: change proto and proto_ops accept type 2024-05-13 18:19:09 -06:00
xdp xsk: Use xsk_buff_pool directly for cq functions 2024-10-14 17:23:49 +02:00
xfrm ipsec-next-2024-11-15 2024-11-18 11:52:49 +00:00
compat.c
devres.c
Kconfig netlink: spec: add shaper YAML spec 2024-10-10 08:30:21 -07:00
Kconfig.debug rtnetlink: Add per-netns RTNL. 2024-10-08 15:16:59 +02:00
Makefile netlink: spec: add shaper YAML spec 2024-10-10 08:30:21 -07:00
socket.c socket: Print pf->create() when it does not clear sock->sk on failure. 2024-10-29 16:31:23 -07:00
sysctl_net.c sysctl: Remove check for sentinel element in ctl_table arrays 2024-06-13 10:50:52 +02:00