linux-stable/net
Patrick McHardy 334a8132d9 [SKBUFF]: Keep track of writable header len of headerless clones
Currently NAT (and others) that want to modify cloned skbs copy them,
even though in the vast majority of cases this is not necessary: the
skb is a clone made by TCP, and the portion NAT wants to modify is
actually writable because TCP releases the header reference before
cloning.

The problem is that there is no clean way for NAT to find out how
long the writable header area is, so this patch introduces skb->hdr_len
to hold this length. When a headerless skb is cloned skb->hdr_len
is set to the current headroom, for regular clones it is copied from
the original. A new function skb_clone_writable(skb, len) returns
whether the skb is writable up to len bytes from skb->data. To avoid
enlarging the skb, the mac_len field is reduced to 16 bits and the
new hdr_len field is put in the remaining 16 bits.
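
The bookkeeping described above can be sketched in user space. This is a toy model under stated assumptions, not the kernel code: struct toy_skb, struct toy_shared and every toy_* helper are hypothetical names invented for illustration, and the reference counting is simplified to two plain integers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy user-space model, NOT the kernel's struct sk_buff. */
struct toy_shared {
	int refs;              /* all skbs pointing at the data buffer */
	int payload_only_refs; /* those that released their header reference */
};

struct toy_skb {
	struct toy_shared *sh;
	unsigned int headroom; /* bytes in front of the data pointer */
	uint16_t mac_len;      /* narrowed to 16 bits ... */
	uint16_t hdr_len;      /* ... so hdr_len fits in the other 16 bits */
	bool cloned;
	bool nohdr;            /* this skb only references the payload */
};

/* The owner promises not to touch the header area any more. */
static void toy_header_release(struct toy_skb *skb)
{
	assert(!skb->nohdr);
	skb->nohdr = true;
	skb->sh->payload_only_refs++;
}

/* The header is shared unless exactly one skb still references it. */
static bool toy_header_cloned(const struct toy_skb *skb)
{
	if (!skb->cloned)
		return false;
	return skb->sh->refs - skb->sh->payload_only_refs != 1;
}

/* Headerless original: hdr_len = current headroom; otherwise copy it. */
static struct toy_skb toy_clone(struct toy_skb *skb)
{
	struct toy_skb n = *skb;

	n.hdr_len = skb->nohdr ? (uint16_t)skb->headroom : skb->hdr_len;
	n.nohdr = false;
	n.cloned = true;
	skb->cloned = true;
	skb->sh->refs++;
	return n;
}

/* Prepend len header bytes: the data pointer moves into the headroom. */
static void toy_push(struct toy_skb *skb, unsigned int len)
{
	assert(len <= skb->headroom);
	skb->headroom -= len;
}

/* Writable up to len bytes from the data pointer? */
static bool toy_clone_writable(const struct toy_skb *skb, unsigned int len)
{
	return !toy_header_cloned(skb) &&
	       skb->headroom + len <= skb->hdr_len;
}
```

In this model, a headerless original with 128 bytes of headroom yields a clone with hdr_len == 128. After the clone pushes 40 bytes of headers, toy_clone_writable(&clone, 40) holds while toy_clone_writable(&clone, 41) does not, because the 41st byte would fall into the shared payload.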

I've done a few rough benchmarks of NAT (not with this exact patch,
but a very similar one). As expected it saves huge amounts of system
time in case of sendfile, bringing it down to basically the same
amount as without NAT. With sendmsg it only helps on loopback,
probably because of the large MTU.

Transmit a 1GB file using sendfile/sendmsg over eth0/lo with and
without NAT:

- sendfile eth0, no NAT:	sys     0m0.388s
- sendfile eth0, NAT:		sys     0m1.835s
- sendfile eth0, NAT + patch:	sys     0m0.370s	(~ -80%)

- sendfile lo, no NAT:		sys     0m0.258s
- sendfile lo, NAT:		sys     0m2.609s
- sendfile lo, NAT + patch:	sys     0m0.260s	(~ -90%)

- sendmsg eth0, no NAT:		sys     0m2.508s
- sendmsg eth0, NAT:		sys     0m2.539s
- sendmsg eth0, NAT + patch:	sys     0m2.445s	(no change)

- sendmsg lo, no NAT:		sys     0m2.151s
- sendmsg lo, NAT:		sys     0m3.557s
- sendmsg lo, NAT + patch:	sys     0m2.159s	(~ -40%)
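
The rounded percentages above appear to be the relative drop in system time from the "NAT" run to the "NAT + patch" run. A minimal sketch of that arithmetic, assuming this reading (reduction_pct is a hypothetical helper, not part of the patch):

```c
#include <assert.h>

/* Relative reduction in system time, in percent, going from the
 * unpatched NAT run to the patched NAT run. Times are in seconds. */
static double reduction_pct(double nat, double nat_patched)
{
	assert(nat > 0.0);
	return (nat - nat_patched) / nat * 100.0;
}
```

For example, reduction_pct(1.835, 0.370) covers the sendfile eth0 case and reduction_pct(3.557, 2.159) the sendmsg lo case.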

I expect other users can see a similar performance improvement;
packet mangling iptables targets, ipip and ip_gre come to mind.

Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
2007-07-10 22:15:37 -07:00
..
802 [NET]: cleanup extra semicolons 2007-04-25 22:29:24 -07:00
8021q [VLAN]: Use rtnl_link API 2007-07-10 22:15:03 -07:00
appletalk header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
atm [NET]: SPIN_LOCK_UNLOCKED cleanup in drivers/atm, net 2007-04-26 01:37:44 -07:00
ax25 [S390] Kconfig: unwanted menus for s390. 2007-05-10 15:46:07 +02:00
bluetooth Fix use-after-free oops in Bluetooth HID. 2007-07-07 12:22:37 -07:00
bridge [BRIDGE]: Round off STP perodic timers. 2007-05-31 01:23:39 -07:00
core [SKBUFF]: Keep track of writable header len of headerless clones 2007-07-10 22:15:37 -07:00
dccp [CCID3]: Fix a bug in the send time processing 2007-07-10 22:15:34 -07:00
decnet [NETLINK]: Mark netlink policies const 2007-06-07 13:40:10 -07:00
econet [SK_BUFF]: Convert skb->tail to sk_buff_data_t 2007-04-25 22:26:28 -07:00
ethernet [SK_BUFF]: Introduce skb_reset_mac_header(skb) 2007-04-25 22:24:32 -07:00
ieee80211 [PATCH] softmac: use list_for_each_entry 2007-07-08 22:16:37 -04:00
ipv4 [TCPv4]: Improve BH latency in /proc/net/tcp 2007-07-10 22:06:20 -07:00
ipv6 bonding / ipv6: no addrconf for slaves separately from master 2007-07-10 12:41:19 -04:00
ipx Fix incorrect prototype for ipxrtr_route_packet() 2007-05-17 05:25:49 -07:00
irda [IrDA]: f-timer reloading when sending rejected frames. 2007-06-08 19:15:56 -07:00
iucv Add suspend-related notifications for CPU hotplug 2007-05-09 12:30:56 -07:00
key xfrm: Add security check before flushing SAD/SPD 2007-06-07 13:42:46 -07:00
lapb [PATCH] remove many unneeded #includes of sched.h 2007-02-14 08:09:54 -08:00
llc Fix occurrences of "the the " 2007-05-09 08:57:56 +02:00
mac80211 [MAC80211]: Add support for SIOCGIWRATE ioctl 2007-07-10 22:14:07 -07:00
netfilter [SKBUFF]: Keep track of writable header len of headerless clones 2007-07-10 22:15:37 -07:00
netlabel [NetLabel]: consolidate the struct socket/sock handling to just struct sock 2007-06-08 13:33:09 -07:00
netlink [NETLINK]: Mark netlink policies const 2007-06-07 13:40:10 -07:00
netrom [NET]: Rework dev_base via list_head (v3) 2007-05-03 15:13:45 -07:00
packet [AF_PACKET]: Kill CONFIG_PACKET_SOCKET. 2007-05-31 01:23:32 -07:00
rfkill [RFKILL]: Fix check for correct rfkill allocation 2007-05-19 12:24:39 -07:00
rose [NET]: Rework dev_base via list_head (v3) 2007-05-03 15:13:45 -07:00
rxrpc [AF_RXRPC]: Return the number of bytes buffered in rxrpc_send_data() 2007-06-18 23:30:41 -07:00
sched [NET]: qdisc_restart - couple of optimizations. 2007-07-10 22:15:36 -07:00
sctp SCTP: Add scope_id validation for link-local binds 2007-07-05 17:40:15 -07:00
sunrpc sendfile: convert nfsd to splice_direct_to_actor() 2007-07-10 08:04:14 +02:00
tipc [TIPC]: Optimize stream send routine to avoid fragmentation 2007-07-10 22:06:12 -07:00
unix [AF_UNIX]: Fix stream recvmsg() race. 2007-06-07 13:40:44 -07:00
wanrouter [NET]: Fix comparisons of unsigned < 0. 2007-06-03 18:08:47 -07:00
wireless [PATCH] cfg80211: fix signed macaddress in sysfs 2007-06-11 17:47:41 -04:00
x25 header cleaning: don't include smp_lock.h when not used 2007-05-08 11:15:07 -07:00
xfrm [XFRM]: Fix MTU calculation for non-ESP SAs 2007-06-18 22:30:15 -07:00
compat.c [NET]: Adding SO_TIMESTAMPNS / SCM_TIMESTAMPNS support 2007-04-25 22:24:21 -07:00
Kconfig [S390] Kconfig: no wireless on s390. 2007-05-10 15:46:08 +02:00
Makefile [NET]: rfkill: add support for input key to control wireless radio 2007-05-07 00:34:20 -07:00
nonet.c [PATCH] Make most file operations structs in fs/ const 2006-03-28 09:16:06 -08:00
socket.c Remove SLAB_CTOR_CONSTRUCTOR 2007-05-17 05:23:04 -07:00
sysctl_net.c Remove obsolete #include <linux/config.h> 2006-06-30 19:25:36 +02:00
TUNABLE Linux-2.6.12-rc2 2005-04-16 15:20:36 -07:00