linux-next/net
Andrey Vagin dbde497966 tcp: don't update snd_nxt, when a socket is switched from repair mode
snd_nxt must be updated synchronously with sk_send_head.  Otherwise
tp->packets_out may be updated incorrectly, what may bring a kernel panic.

Here is a kernel panic from my host.
[  103.043194] BUG: unable to handle kernel NULL pointer dereference at 0000000000000048
[  103.044025] IP: [<ffffffff815aaaaf>] tcp_rearm_rto+0xcf/0x150
...
[  146.301158] Call Trace:
[  146.301158]  [<ffffffff815ab7f0>] tcp_ack+0xcc0/0x12c0

Before this panic a tcp socket was restored. This socket had sent and
unsent data in the write queue. Sent data was restored in repair mode,
then the socket was switched from reapair mode and unsent data was
restored. After that the socket was switched back into repair mode.

In that moment we had a socket where write queue looks like this:
snd_una    snd_nxt   write_seq
   |_________|________|
             |
	  sk_send_head

After a second switching from repair mode the state of socket was
changed:

snd_una          snd_nxt, write_seq
   |_________ ________|
             |
	  sk_send_head

This state is inconsistent, because snd_nxt and sk_send_head are not
synchronized.

Bellow you can find a call trace, how packets_out can be incremented
twice for one skb, if snd_nxt and sk_send_head are not synchronized.
In this case packets_out will be always positive, even when
sk_write_queue is empty.

tcp_write_wakeup
	skb = tcp_send_head(sk);
	tcp_fragment
		if (!before(tp->snd_nxt, TCP_SKB_CB(buff)->end_seq))
			tcp_adjust_pcount(sk, skb, diff);
	tcp_event_new_data_sent
		tp->packets_out += tcp_skb_pcount(skb);

I think update of snd_nxt isn't required, when a socket is switched from
repair mode.  Because it's initialized in tcp_connect_init. Then when a
write queue is restored, snd_nxt is incremented in tcp_event_new_data_sent,
so it's always is in consistent state.

I have checked, that the bug is not reproduced with this patch and
all tests about restoring tcp connections work fine.

Cc: Pavel Emelyanov <xemul@parallels.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru>
Cc: James Morris <jmorris@namei.org>
Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org>
Cc: Patrick McHardy <kaber@trash.net>
Signed-off-by: Andrey Vagin <avagin@openvz.org>
Acked-by: Pavel Emelyanov <xemul@parallels.com>
Acked-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
2013-11-19 16:14:20 -05:00
..
9p file->f_op is never NULL... 2013-10-24 23:34:54 -04:00
802 mrp: add periodictimer to allow retries when packets get lost 2013-09-23 16:53:52 -04:00
8021q vlan: Implement vlan_dev_get_egress_qos_mask as an inline. 2013-11-11 00:42:07 -05:00
appletalk net: proc_fs: trivial: print UIDs as unsigned int 2013-08-15 14:37:46 -07:00
atm net: always pass struct netdev_notifier_info to netdevice notifiers 2013-05-28 21:58:54 -07:00
ax25 ax25: cleanup a range test 2013-10-18 13:56:07 -04:00
batman-adv batman-adv: generalize batman-adv icmp packet handling 2013-10-23 17:03:47 +02:00
bluetooth Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-11-08 09:03:10 -05:00
bridge bridge: Fix memory leak when deleting bridge with vlan filtering enabled 2013-11-14 16:16:34 -05:00
caif caif: use pskb_put() instead of reimplementing its functionality 2013-11-07 19:28:59 -05:00
can net: 8021q/bluetooth/bridge/can/ceph: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
ceph net: 8021q/bluetooth/bridge/can/ceph: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
core macvlan: disable LRO on lower device instead of macvlan 2013-11-15 17:55:48 -05:00
dcb rtnetlink: Remove passing of attributes into rtnl_doit functions 2013-03-22 10:31:16 -04:00
dccp ipv4: introduce new IP_MTU_DISCOVER mode IP_PMTUDISC_INTERFACE 2013-11-05 21:52:27 -05:00
decnet netfilter: pass hook ops to hookfn 2013-10-14 11:29:31 +02:00
dns_resolver net: strict_strtoul is obsolete, use kstrtoul instead 2013-07-12 16:09:14 -07:00
dsa net: dsa: inherit addr_assign_type along with dev_addr 2013-09-03 20:57:49 -04:00
ethernet ethernet: use likely() for common Ethernet encap 2013-09-30 21:52:53 -07:00
hsr net/hsr: Fix possible leak in 'hsr_get_node_status()' 2013-11-14 17:26:21 -05:00
ieee802154 inet: prevent leakage of uninitialized memory to user in recv syscalls 2013-11-18 15:12:03 -05:00
ipv4 tcp: don't update snd_nxt, when a socket is switched from repair mode 2013-11-19 16:14:20 -05:00
ipv6 ipv6: Fix inet6_init() cleanup order 2013-11-18 15:38:46 -05:00
ipx net: proc_fs: trivial: print UIDs as unsigned int 2013-08-15 14:37:46 -07:00
irda genetlink: make all genl_ops users const 2013-11-14 17:10:41 -05:00
iucv net: delete __cpuinit usage from all net files 2013-07-14 19:36:58 -04:00
key xfrm: Guard IPsec anti replay window against replay bitmap 2013-09-17 12:17:10 +02:00
l2tp inet: prevent leakage of uninitialized memory to user in recv syscalls 2013-11-18 15:12:03 -05:00
lapb net/lapb: re-send packets on timeout 2013-09-23 16:52:45 -04:00
llc llc: Use normal etherdevice.h tests 2013-09-03 22:34:47 -04:00
mac80211 Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-next into for-davem 2013-11-08 09:03:10 -05:00
mac802154 6lowpan: set and use mac_len for mac header length 2013-10-30 17:18:46 -04:00
mpls ipip: add GSO/TSO support 2013-10-19 19:36:19 -04:00
netfilter genetlink: make all genl_ops users const 2013-11-14 17:10:41 -05:00
netlabel genetlink: make all genl_ops users const 2013-11-14 17:10:41 -05:00
netlink netlink: fix documentation typo in netlink_set_err() 2013-11-19 15:07:01 -05:00
netrom net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
nfc genetlink: make all genl_ops users const 2013-11-14 17:10:41 -05:00
openvswitch genetlink: make all genl_ops users const 2013-11-14 17:10:41 -05:00
packet net: packet: use reciprocal_divide in fanout_demux_hash 2013-08-29 16:43:29 -04:00
phonet inet: prevent leakage of uninitialized memory to user in recv syscalls 2013-11-18 15:12:03 -05:00
rds inet: convert inet_ehash_secret and ipv6_hash_secret to net_get_random_once 2013-10-19 19:45:35 -04:00
rfkill net: rfkill: gpio: add ACPI support 2013-10-28 15:05:25 +01:00
rose net: Convert uses of typedef ctl_table to struct ctl_table 2013-06-13 02:36:09 -07:00
rxrpc net: misc: Remove extern from function prototypes 2013-10-19 19:12:11 -04:00
sched pkt_sched: fq: fix pacing for small frames 2013-11-15 21:01:52 -05:00
sctp net: sctp: bug-fixing: retran_path not set properly after transports recovering (v3) 2013-11-14 16:35:09 -05:00
sunrpc Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next 2013-11-13 17:40:34 +09:00
tipc tipc: fix dereference before check warning 2013-11-15 03:11:06 -05:00
unix net: unix: inherit SOCK_PASS{CRED, SEC} flags from socket to fix race 2013-10-19 18:50:15 -04:00
vmw_vsock Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net 2013-08-16 15:37:26 -07:00
wimax genetlink: make all genl_ops users const 2013-11-14 17:10:41 -05:00
wireless genetlink: make all genl_ops users const 2013-11-14 17:10:41 -05:00
x25 net: x25: Fix dead URLs in Kconfig 2013-10-29 17:35:17 -04:00
xfrm net: move pskb_put() to core code 2013-11-07 19:28:58 -05:00
compat.c net: heap overflow in __audit_sockaddr() 2013-10-03 16:05:14 -04:00
Kconfig net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0) 2013-11-03 23:20:14 -05:00
Makefile net/hsr: Add support for the High-availability Seamless Redundancy protocol (HSRv0) 2013-11-03 23:20:14 -05:00
nonet.c
socket.c net: heap overflow in __audit_sockaddr() 2013-10-03 16:05:14 -04:00
sysctl_net.c net: Update the sysctl permissions handler to test effective uid/gid 2013-10-07 15:57:56 -04:00