linux-next

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git synced 2025-01-18 14:25:25 +00:00

Author	SHA1	Message	Date
David S. Miller	33f8a0458b	wireless-drivers-next patches for 4.10 Major changes: iwlwifi * finalize and enable dynamic queue allocation * use dev_coredumpmsg() to prevent locking the driver * small fix to pass the AID to the FW * use FW PS decisions with multi-queue ath9k * add device tree bindings * switch to use mac80211 intermediate software queues to reduce latency and fix bufferbloat wl18xx * allow scanning in AP mode -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) iQEcBAABAgAGBQJYOAC4AAoJEG4XJFUm622bUYkH/3SSYp6moSdKpVnVPx7ST7yK t9WHR9IMZFIhD6vq8AK6+8OQr1TgGjHfPu+WZj7CIl8nu53kcgPRi51gg1mndbNg 9N3RbVp06nGbM2VnW8ZIpg3OLIXatZ4c9g3LFvvtyobYvWGJ6W4D79JdlmTG1ELr XAjInbxFsgon+CwqCMOaAJx8xYp42rBnPRZZvhOq9O33kRw8Umo9UQw0s1U2Vfgx prxQ6d0GxNAPEe8QiDw/vtBcXWFMOhQeDl8sK70ZcojSn1FY730NsIh/Y86PcQTK 6TsvOL5gg+rd0ln8TZRAslnDrZBAhTEDqUzLQMRJ9VjEj5RFd8eLCSIzHfaroI8= =4qCH -----END PGP SIGNATURE----- Merge tag 'wireless-drivers-next-for-davem-2016-11-25' of git://git.kernel.org/pub/scm/linux/kernel/git/kvalo/wireless-drivers-next Kalle Valo says: ==================== wireless-drivers-next patches for 4.10 Major changes: iwlwifi * finalize and enable dynamic queue allocation * use dev_coredumpmsg() to prevent locking the driver * small fix to pass the AID to the FW * use FW PS decisions with multi-queue ath9k * add device tree bindings * switch to use mac80211 intermediate software queues to reduce latency and fix bufferbloat wl18xx * allow scanning in AP mode ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-27 20:26:59 -05:00
David S. Miller	8eb4adf60b	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec Steffen Klassert says: ==================== pull request (net): ipsec 2016-11-25 1) Fix a refcount leak in vti6. From Nicolas Dichtel. 2) Fix a wrong if statement in xfrm_sk_policy_lookup. From Florian Westphal. 3) The flowcache watermarks are per cpu. Take this into account when comparing to the threshold where we refusing new allocations. From Miroslav Urbanek. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-27 20:21:48 -05:00
Johan Hovold	fd05d7b18c	net: dsa: fix fixed-link-phy device leaks Make sure to drop the reference taken by of_phy_find_device() when registering and deregistering the fixed-link PHY-device. Fixes: 39b0c705195e ("net: dsa: Allow configuration of CPU & DSA port speeds/duplex") Signed-off-by: Johan Hovold <johan@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-27 20:01:15 -05:00
David S. Miller	0b42f25d2f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net udplite conflict is resolved by taking what 'net-next' did which removed the backlog receive method assignment, since it is no longer necessary. Two entries were added to the non-priv ethtool operations switch statement, one in 'net' and one in 'net-next, so simple overlapping changes. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-26 23:42:21 -05:00
Jon Paul Maloy	6998cc6ec2	tipc: resolve connection flow control compatibility problem In commit 10724cc7bb78 ("tipc: redesign connection-level flow control") we replaced the previous message based flow control with one based on 1k blocks. In order to ensure backwards compatibility the mechanism falls back to using message as base unit when it senses that the peer doesn't support the new algorithm. The default flow control window, i.e., how many units can be sent before the sender blocks and waits for an acknowledge (aka advertisement) is 512. This was tested against the previous version, which uses an acknowledge frequency of on ack per 256 received message, and found to work fine. However, we missed the fact that versions older than Linux 3.15 use an acknowledge frequency of 512, which is exactly the limit where a 4.6+ sender will stop and wait for acknowledge. This would also work fine if it weren't for the fact that if the first sent message on a 4.6+ server side is an empty SYNACK, this one is also is counted as a sent message, while it is not counted as a received message on a legacy 3.15-receiver. This leads to the sender always being one step ahead of the receiver, a scenario causing the sender to block after 512 sent messages, while the receiver only has registered 511 read messages. Hence, the legacy receiver is not trigged to send an acknowledge, with a permanently blocked sender as result. We solve this deadlock by simply allowing the sender to send one more message before it blocks, i.e., by a making minimal change to the condition used for determining connection congestion. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 21:38:16 -05:00
Miroslav Lichvar	8006f6bf5e	net: ethtool: don't require CAP_NET_ADMIN for ETHTOOL_GLINKSETTINGS The ETHTOOL_GLINKSETTINGS command is deprecating the ETHTOOL_GSET command and likewise it shouldn't require the CAP_NET_ADMIN capability. Signed-off-by: Miroslav Lichvar <mlichvar@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:23:30 -05:00
Jon Paul Maloy	d876a4d2af	tipc: improve sanity check for received domain records In commit 35c55c9877f8 ("tipc: add neighbor monitoring framework") we added a data area to the link monitor STATE messages under the assumption that previous versions did not use any such data area. For versions older than Linux 4.3 this assumption is not correct. In those version, all STATE messages sent out from a node inadvertently contain a 16 byte data area containing a string; -a leftover from previous RESET messages which were using this during the setup phase. This string serves no purpose in STATE messages, and should no be there. Unfortunately, this data area is delivered to the link monitor framework, where a sanity check catches that it is not a correct domain record, and drops it. It also issues a rate limited warning about the event. Since such events occur much more frequently than anticipated, we now choose to remove the warning in order to not fill the kernel log with useless contents. We also make the sanity check stricter, to further reduce the risk that such data is inavertently admitted. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:06:18 -05:00
Jon Paul Maloy	f79675563a	tipc: fix compatibility bug in link monitoring commit 817298102b0b ("tipc: fix link priority propagation") introduced a compatibility problem between TIPC versions newer than Linux 4.6 and those older than Linux 4.4. In versions later than 4.4, link STATE messages only contain a non-zero link priority value when the sender wants the receiver to change its priority. This has the effect that the receiver resets itself in order to apply the new priority. This works well, and is consistent with the said commit. However, in versions older than 4.4 a valid link priority is present in all sent link STATE messages, leading to cyclic link establishment and reset on the 4.6+ node. We fix this by adding a test that the received value should not only be valid, but also differ from the current value in order to cause the receiving link endpoint to reset. Reported-by: Amar Nv <amar.nv005@gmail.com> Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 20:06:18 -05:00
Eric Dumazet	f52dffe049	net: properly flush delay-freed skbs Typical NAPI drivers use napi_consume_skb(skb) at TX completion time. This put skb in a percpu special queue, napi_alloc_cache, to get bulk frees. It turns out the queue is not flushed and hits the NAPI_SKB_CACHE_SIZE limit quite often, with skbs that were queued hundreds of usec earlier. I measured this can take ~6000 nsec to perform one flush. __kfree_skb_flush() can be called from two points right now : 1) From net_tx_action(), but only for skbs that were queued to sd->completion_queue. -> Irrelevant for NAPI drivers in normal operation. 2) From net_rx_action(), but only under high stress or if RPS/RFS has a pending action. This patch changes net_rx_action() to perform the flush in all cases and after more urgent operations happened (like kicking remote CPUS for RPS/RFS). Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Alexander Duyck <alexander.h.duyck@intel.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 19:37:49 -05:00
Daniel Mack	33b486793c	net: ipv4, ipv6: run cgroup eBPF egress programs If the cgroup associated with the receiving socket has an eBPF programs installed, run them from ip_output(), ip6_output() and ip_mc_output(). From mentioned functions we have two socket contexts as per 7026b1ddb6b8 ("netfilter: Pass socket pointer down through okfn()."). We explicitly need to use sk instead of skb->sk here, since otherwise the same program would run multiple times on egress when encap devices are involved, which is not desired in our case. eBPF programs used in this context are expected to either return 1 to let the packet pass, or != 1 to drop them. The programs have access to the skb through bpf_skb_load_bytes(), and the payload starts at the network headers (L3). Note that cgroup_bpf_run_filter() is stubbed out as static inline nop for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if the feature is unused. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:26:04 -05:00
Daniel Mack	c11cd3a6ec	net: filter: run cgroup eBPF ingress programs If the cgroup associated with the receiving socket has an eBPF programs installed, run them from sk_filter_trim_cap(). eBPF programs used in this context are expected to either return 1 to let the packet pass, or != 1 to drop them. The programs have access to the skb through bpf_skb_load_bytes(), and the payload starts at the network headers (L3). Note that cgroup_bpf_run_filter() is stubbed out as static inline nop for !CONFIG_CGROUP_BPF, and is otherwise guarded by a static key if the feature is unused. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:26:04 -05:00
Daniel Mack	0e33661de4	bpf: add new prog type for cgroup socket filtering This program type is similar to BPF_PROG_TYPE_SOCKET_FILTER, except that it does not allow BPF_LD_[ABS\|IND] instructions and hooks up the bpf_skb_load_bytes() helper. Programs of this type will be attached to cgroups for network filtering and accounting. Signed-off-by: Daniel Mack <daniel@zonque.org> Acked-by: Alexei Starovoitov <ast@kernel.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:25:52 -05:00
David S. Miller	fb09c8c524	linux-can-fixes-for-4.9-20161123 -----BEGIN PGP SIGNATURE----- iQFHBAABCgAxFiEES2FAuYbJvAGobdVQPTuqJaypJWoFAlg1ppETHG1rbEBwZW5n dXRyb25peC5kZQAKCRA9O6olrKklaqieB/4o9PaD8Rj38Piy7lJW1ahAxoUnY4AA Vu1eFvtidUswO5RV6mDOuqhTulzrMcPQZguW/S7eLZh6hWYVVLlgkrNLj/RpMXsH rqGRC/sL5ICL1q/ijYK6NJJ3+GFQhl92gG+wJxsQfETWVDKH13N3sWcEyBh0+C5P lnPFNVDVSy4bpkEgXAN/sfAvoHzW//34cnxTzlsd1COAWlxZ+HHgBAGp4kaYTpbF Vz3kuNPfDI7U+36quE8SUXe/R9HfqQBtfbFtaxha8vqH8Fw6MJYO0BUJVmtawTDq nBFvB/x+d0n1YeOgo5UD5bW9thItF57GEscWqYpTuhZ0jlPr5+CZeo14 =FImK -----END PGP SIGNATURE----- Merge tag 'linux-can-fixes-for-4.9-20161123' of git://git.kernel.org/pub/scm/linux/kernel/git/mkl/linux-can Marc Kleine-Budde says: ==================== pull-request: can 2016-11-23 this is a pull request for net/master. The patch by Oliver Hartkopp for the broadcast manager (bcm) fixes the CAN-FD support, which may cause an out-of-bounds access otherwise. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-25 16:17:12 -05:00
David S. Miller	f9e154a0e6	Merge branch 'for-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth Johan Hedberg says: ==================== pull request: bluetooth 2016-11-23 Sorry about the late pull request for 4.9, but we have one more important Bluetooth patch that should make it to the release. It fixes connection creation for Bluetooth LE controllers that do not have a public address (only a random one). Please let me know if there are any issues pulling. Thanks. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 16:24:20 -05:00
Roman Mashak	19a8bb28d1	net sched filters: fix filter handle ID in tfilter_notify_chain() Should pass valid filter handle, not the netlink flags. Fixes: 30a391a13ab92 ("net sched filters: pass netlink message flags in event notification") Signed-off-by: Roman Mashak <mrv@mojatatu.com> Signed-off-by: Jamal Hadi Salim <jhs@mojatatu.com> Reported-by: Cong Wang <xiyou.wangcong@gmail.com> Acked-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 16:05:58 -05:00
Florian Fainelli	4b65246b42	ethtool: Protect {get, set}_phy_tunable with PHY device mutex PHY drivers should be able to rely on the caller of {get,set}_tunable to have acquired the PHY device mutex, in order to both serialize against concurrent calls of these functions, but also against PHY state machine changes. All ethtool PHY-level functions do this, except {get,set}_tunable, so we make them consistent here as well. We need to update the Microsemi PHY driver in the same commit to avoid introducing either deadlocks, or lack of proper locking. Fixes: 968ad9da7e0e ("ethtool: Implements ETHTOOL_PHY_GTUNABLE/ETHTOOL_PHY_STUNABLE") Fixes: 310d9ad57ae0 ("net: phy: Add downshift get/set support in Microsemi PHYs driver") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Andrew Lunn <andrew@lunn.ch> Reviewed-by: Allan W. Nielsen <allan.nielsen@microsemi.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 16:02:32 -05:00
Roi Dayan	59bfde01fa	devlink: Add E-Switch inline mode control Some HWs need the VF driver to put part of the packet headers on the TX descriptor so the e-switch can do proper matching and steering. The supported modes: none, link, network, transport. Signed-off-by: Roi Dayan <roid@mellanox.com> Reviewed-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 16:01:14 -05:00
Or Gerlitz	3df5b3c675	net: Add net-device param to the get offloaded stats ndo Some drivers would need to check few internal matters for that. To be used in downstream mlx5 commit. Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 16:01:14 -05:00
Eric Dumazet	f8071cde78	tcp: enhance tcp_collapse_retrans() with skb_shift() In commit 2331ccc5b323 ("tcp: enhance tcp collapsing"), we made a first step allowing copying right skb to left skb head. Since all skbs in socket write queue are headless (but possibly the very first one), this strategy often does not work. This patch extends tcp_collapse_retrans() to perform frag shifting, thanks to skb_shift() helper. This helper needs to not BUG on non headless skbs, as callers are ok with that. Tested: Following packetdrill test now passes : 0.000 socket(..., SOCK_STREAM, IPPROTO_TCP) = 3 +0 setsockopt(3, SOL_SOCKET, SO_REUSEADDR, [1], 4) = 0 +0 bind(3, ..., ...) = 0 +0 listen(3, 1) = 0 +0 < S 0:0(0) win 32792 <mss 1460,sackOK,nop,nop,nop,wscale 8> +0 > S. 0:0(0) ack 1 <mss 1460,nop,nop,sackOK,nop,wscale 8> +.100 < . 1:1(0) ack 1 win 257 +0 accept(3, ..., ...) = 4 +0 setsockopt(4, SOL_TCP, TCP_NODELAY, [1], 4) = 0 +0 write(4, ..., 200) = 200 +0 > P. 1:201(200) ack 1 +.001 write(4, ..., 200) = 200 +0 > P. 201:401(200) ack 1 +.001 write(4, ..., 200) = 200 +0 > P. 401:601(200) ack 1 +.001 write(4, ..., 200) = 200 +0 > P. 601:801(200) ack 1 +.001 write(4, ..., 200) = 200 +0 > P. 801:1001(200) ack 1 +.001 write(4, ..., 100) = 100 +0 > P. 1001:1101(100) ack 1 +.001 write(4, ..., 100) = 100 +0 > P. 1101:1201(100) ack 1 +.001 write(4, ..., 100) = 100 +0 > P. 1201:1301(100) ack 1 +.001 write(4, ..., 100) = 100 +0 > P. 1301:1401(100) ack 1 +.099 < . 1:1(0) ack 201 win 257 +.001 < . 1:1(0) ack 201 win 257 <nop,nop,sack 1001:1401> +0 > P. 201:1001(800) ack 1 Signed-off-by: Eric Dumazet <edumazet@google.com> Cc: Neal Cardwell <ncardwell@google.com> Cc: Yuchung Cheng <ycheng@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 15:40:42 -05:00
Eric Dumazet	30c7be26fd	udplite: call proper backlog handlers In commits 93821778def10 ("udp: Fix rcv socket locking") and f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb") UDP backlog handlers were renamed, but UDPlite was forgotten. This leads to crashes if UDPlite header is pulled twice, which happens starting from commit e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Bug found by syzkaller team, thanks a lot guys ! Note that backlog use in UDP/UDPlite is scheduled to be removed starting from linux-4.10, so this patch is only needed up to linux-4.9 Fixes: 93821778def1 ("udp: Fix rcv socket locking") Fixes: f7ad74fef3af ("net/ipv6/udp: UDP encapsulation: break backlog_rcv into __udpv6_queue_rcv_skb") Fixes: e6afc8ace6dd ("udp: remove headers from UDP packets before queueing") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Cc: Benjamin LaHaise <bcrl@kvack.org> Cc: Herbert Xu <herbert@gondor.apana.org.au> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 15:32:14 -05:00
Paolo Abeni	764d3be6e4	ipv6: bump genid when the IFA_F_TENTATIVE flag is clear When an ipv6 address has the tentative flag set, it can't be used as source for egress traffic, while the associated route, if any, can be looked up and even stored into some dst_cache. In the latter scenario, the source ipv6 address selected and stored in the cache is most probably wrong (e.g. with link-local scope) and the entity using the dst_cache will experience lack of ipv6 connectivity until said cache is cleared or invalidated. Overall this may cause lack of connectivity over most IPv6 tunnels (comprising geneve and vxlan), if the first egress packet reaches the tunnel before the DaD is completed for the used ipv6 address. This patch bumps a new genid after that the IFA_F_TENTATIVE flag is cleared, so that dst_cache will be invalidated on next lookup and ipv6 connectivity restored. Fixes: 0c1d70af924b ("net: use dst_cache for vxlan device") Fixes: 468dfffcd762 ("geneve: add dst caching support") Acked-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 12:04:10 -05:00
Stefan Hajnoczi	b911682318	VSOCK: add loopback to virtio_transport The VMware VMCI transport supports loopback inside virtual machines. This patch implements loopback for virtio-vsock. Flow control is handled by the virtio-vsock protocol as usual. The sending process stops transmitting on a connection when the peer's receive buffer space is exhausted. Cathy Avery <cavery@redhat.com> noticed this difference between VMCI and virtio-vsock when a test case using loopback failed. Although loopback isn't the main point of AF_VSOCK, it is useful for testing and virtio-vsock must match VMCI semantics so that userspace programs run regardless of the underlying transport. My understanding is that loopback is not supported on the host side with VMCI. Follow that by implementing it only in the guest driver, not the vhost host driver. Cc: Jorgen Hansen <jhansen@vmware.com> Reported-by: Cathy Avery <cavery@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-24 11:53:15 -05:00
Liping Zhang	49cdc4c749	netfilter: nft_range: add the missing NULL pointer check Otherwise, kernel panic will happen if the user does not specify the related attributes. Fixes: 0f3cd9b36977 ("netfilter: nf_tables: add range expression") Signed-off-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-24 14:43:35 +01:00
Anders K. Pedersen	d3e2a1110c	netfilter: nf_tables: fix inconsistent element expiration calculation As Liping Zhang reports, after commit a8b1e36d0d1d ("netfilter: nft_dynset: fix element timeout for HZ != 1000"), priv->timeout was stored in jiffies, while set->timeout was stored in milliseconds. This is inconsistent and incorrect. Firstly, we already call msecs_to_jiffies in nft_set_elem_init, so priv->timeout will be converted to jiffies twice. Secondly, if the user did not specify the NFTA_DYNSET_TIMEOUT attr, set->timeout will be used, but we forget to call msecs_to_jiffies when do update elements. Fix this by using jiffies internally for traditional sets and doing the conversions to/from msec when interacting with userspace - as dynset already does. This is preferable to doing the conversions, when elements are inserted or updated, because this can happen very frequently on busy dynsets. Fixes: a8b1e36d0d1d ("netfilter: nft_dynset: fix element timeout for HZ != 1000") Reported-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Anders K. Pedersen <akp@cohaesio.com> Acked-by: Liping Zhang <zlpnobody@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-24 14:43:34 +01:00
Florian Westphal	7223ecd466	netfilter: nat: switch to new rhlist interface I got offlist bug report about failing connections and high cpu usage. This happens because we hit 'elasticity' checks in rhashtable that refuses bucket list exceeding 16 entries. The nat bysrc hash unfortunately needs to insert distinct objects that share same key and are identical (have same source tuple), this cannot be avoided. Switch to the rhlist interface which is designed for this. The nulls_base is removed here, I don't think its needed: A (unlikely) false positive results in unneeded port clash resolution, a false negative results in packet drop during conntrack confirmation, when we try to insert the duplicate into main conntrack hash table. Tested by adding multiple ip addresses to host, then adding iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE ... and then creating multiple connections, from same source port but different addresses: for i in $(seq 2000 2032);do nc -p 1234 192.168.7.1 $i > /dev/null & done (all of these then get hashed to same bysource slot) Then, to test that nat conflict resultion is working: nc -s 10.0.0.1 -p 1234 192.168.7.1 2000 nc -s 10.0.0.2 -p 1234 192.168.7.1 2000 tcp .. src=10.0.0.1 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1024 [ASSURED] tcp .. src=10.0.0.2 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1025 [ASSURED] tcp .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2000 src=192.168.7.1 dst=192.168.7.10 sport=2000 dport=1234 [ASSURED] tcp .. src=192.168.7.10 dst=192.168.7.1 sport=1234 dport=2001 src=192.168.7.1 dst=192.168.7.10 sport=2001 dport=1234 [ASSURED] [..] -> nat altered source ports to 1024 and 1025, respectively. This can also be confirmed on destination host which shows ESTAB 0 0 192.168.7.1:2000 192.168.7.10:1024 ESTAB 0 0 192.168.7.1:2000 192.168.7.10:1025 ESTAB 0 0 192.168.7.1:2000 192.168.7.10:1234 Cc: Herbert Xu <herbert@gondor.apana.org.au> Fixes: 870190a9ec907 ("netfilter: nat: convert nat bysrc hash to rhashtable") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-24 14:43:34 +01:00
Florian Westphal	728e87b496	netfilter: nat: fix cmp return value The comparator works like memcmp, i.e. 0 means objects are equal. In other words, when objects are distinct they are treated as identical, when they are distinct they are allegedly the same. The first case is rare (distinct objects are unlikely to get hashed to same bucket). The second case results in unneeded port conflict resolutions attempts. Fixes: 870190a9ec907 ("netfilter: nat: convert nat bysrc hash to rhashtable") Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-24 14:43:33 +01:00
Laura Garcia Liebana	abd66e9f3c	netfilter: nft_hash: validate maximum value of u32 netlink hash attribute Use the function nft_parse_u32_check() to fetch the value and validate the u32 attribute into the hash len u8 field. This patch revisits 4da449ae1df9 ("netfilter: nft_exthdr: Add size check on u8 nft_exthdr attributes"). Fixes: cb1b69b0b15b ("netfilter: nf_tables: add hash expression") Signed-off-by: Laura Garcia Liebana <nevola@gmail.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-24 14:40:03 +01:00
David Ahern	00b4422fe3	netfilter: Update nf_send_reset6 to consider L3 domain nf_send_reset6 is not considering the L3 domain and lookups are sent to the wrong table. For example consider the following output rule: ip6tables -A OUTPUT -p tcp --dport 12345 -j REJECT --reject-with tcp-reset using perf to analyze lookups via the fib6_table_lookup tracepoint shows: swapper 0 [001] 248.787816: fib6:fib6_table_lookup: table 255 oif 0 iif 1 src 2100:1::3 dst 2100:1: ffffffff81439cdc perf_trace_fib6_table_lookup ([kernel.kallsyms]) ffffffff814c1ce3 trace_fib6_table_lookup ([kernel.kallsyms]) ffffffff814c3e89 ip6_pol_route ([kernel.kallsyms]) ffffffff814c40d5 ip6_pol_route_output ([kernel.kallsyms]) ffffffff814e7b6f fib6_rule_action ([kernel.kallsyms]) ffffffff81437f60 fib_rules_lookup ([kernel.kallsyms]) ffffffff814e7c79 fib6_rule_lookup ([kernel.kallsyms]) ffffffff814c2541 ip6_route_output_flags ([kernel.kallsyms]) 528 nf_send_reset6 ([nf_reject_ipv6]) The lookup is directed to table 255 rather than the table associated with the device via the L3 domain. Update nf_send_reset6 to pull the L3 domain from the dst currently attached to the skb. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-24 12:47:08 +01:00
David Ahern	6d8b49c3a3	netfilter: Update ip_route_me_harder to consider L3 domain ip_route_me_harder is not considering the L3 domain and sending lookups to the wrong table. For example consider the following output rule: iptables -I OUTPUT -p tcp --dport 12345 -j REJECT --reject-with tcp-reset using perf to analyze lookups via the fib_table_lookup tracepoint shows: vrf-test 1187 [001] 46887.295927: fib:fib_table_lookup: table 255 oif 0 iif 0 src 0.0.0.0 dst 10.100.1.254 tos 0 scope 0 flags 0 ffffffff8143922c perf_trace_fib_table_lookup ([kernel.kallsyms]) ffffffff81493aac fib_table_lookup ([kernel.kallsyms]) ffffffff8148dda3 __inet_dev_addr_type ([kernel.kallsyms]) ffffffff8148ddf6 inet_addr_type ([kernel.kallsyms]) ffffffff8149e344 ip_route_me_harder ([kernel.kallsyms]) and vrf-test 1187 [001] 46887.295933: fib:fib_table_lookup: table 255 oif 0 iif 1 src 10.100.1.254 dst 10.100.1.2 tos 0 scope 0 flags ffffffff8143922c perf_trace_fib_table_lookup ([kernel.kallsyms]) ffffffff81493aac fib_table_lookup ([kernel.kallsyms]) ffffffff814998ff fib4_rule_action ([kernel.kallsyms]) ffffffff81437f35 fib_rules_lookup ([kernel.kallsyms]) ffffffff81499758 __fib_lookup ([kernel.kallsyms]) ffffffff8144f010 fib_lookup.constprop.34 ([kernel.kallsyms]) ffffffff8144f759 __ip_route_output_key_hash ([kernel.kallsyms]) ffffffff8144fc6a ip_route_output_flow ([kernel.kallsyms]) ffffffff8149e39b ip_route_me_harder ([kernel.kallsyms]) In both cases the lookups are directed to table 255 rather than the table associated with the device via the L3 domain. Update both lookups to pull the L3 domain from the dst currently attached to the skb. Signed-off-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2016-11-24 12:44:36 +01:00
WANG Cong	a4cd0271ea	net: revert "net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit" This reverts commit 7c6ae610a1f0, because l2tp_xmit_skb() never returns NET_XMIT_CN, it ignores the return value of l2tp_xmit_core(). Cc: Gao Feng <gfree.wind@gmail.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-23 20:18:36 -05:00
Zhang Shengju	93af205656	rtnetlink: fix the wrong minimal dump size getting from rtnl_calcit() For RT netlink, calcit() function should return the minimal size for netlink dump message. This will make sure that dump message for every network device can be stored. Currently, rtnl_calcit() function doesn't account the size of header of netlink message, this patch will fix it. Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-23 20:18:36 -05:00
Oliver Hartkopp	5499a6b22e	can: bcm: fix support for CAN FD frames Since commit 6f3b911d5f29b98 ("can: bcm: add support for CAN FD frames") the CAN broadcast manager supports CAN and CAN FD data frames. As these data frames are embedded in struct can[fd]_frames which have a different length the access to the provided array of CAN frames became dependend of op->cfsiz. By using a struct canfd_frame pointer for the array of CAN frames the new offset calculation based on op->cfsiz was accidently applied to CAN FD frame element lengths. This fix makes the pointer to the arrays of the different CAN frame types a void pointer so that the offset calculation in bytes accesses the correct CAN frame elements. Reference: http://marc.info/?l=linux-netdev&m=147980658909653 Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Tested-by: Andrey Konovalov <andreyknvl@google.com> Cc: linux-stable <stable@vger.kernel.org> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2016-11-23 15:22:18 +01:00
Miroslav Urbanek	6b22648781	flowcache: Increase threshold for refusing new allocations The threshold for OOM protection is too small for systems with large number of CPUs. Applications report ENOBUFs on connect() every 10 minutes. The problem is that the variable net->xfrm.flow_cache_gc_count is a global counter while the variable fc->high_watermark is a per-CPU constant. Take the number of CPUs into account as well. Fixes: 6ad3122a08e3 ("flowcache: Avoid OOM condition under preasure") Reported-by: Lukáš Koldrt <lk@excello.cz> Tested-by: Jan Hejl <jh@excello.cz> Signed-off-by: Miroslav Urbanek <mu@miroslavurbanek.com> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2016-11-23 06:37:09 +01:00
Johan Hedberg	39385cb5f3	Bluetooth: Fix using the correct source address type The hci_get_route() API is used to look up local HCI devices, however so far it has been incapable of dealing with anything else than the public address of HCI devices. This completely breaks with LE-only HCI devices that do not come with a public address, but use a static random address instead. This patch exteds the hci_get_route() API with a src_type parameter that's used for comparing with the right address of each HCI device. Signed-off-by: Johan Hedberg <johan.hedberg@intel.com> Signed-off-by: Marcel Holtmann <marcel@holtmann.org>	2016-11-22 22:50:46 +01:00
Eric Dumazet	c9b8af1330	flow_dissect: call init_default_flow_dissectors() earlier Andre Noll reported panics after my recent fix (commit 34fad54c2537 "net: __skb_flow_dissect() must cap its return value") After some more headaches, Alexander root caused the problem to init_default_flow_dissectors() being called too late, in case a network driver like IGB is not a module and receives DHCP message very early. Fix is to call init_default_flow_dissectors() much earlier, as it is a core infrastructure and does not depend on another kernel service. Fixes: 06635a35d13d4 ("flow_dissect: use programable dissector in skb_flow_dissect and friends") Signed-off-by: Eric Dumazet <edumazet@google.com> Reported-by: Andre Noll <maan@tuebingen.mpg.de> Diagnosed-by: Alexander Duyck <alexander.h.duyck@intel.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-22 14:44:01 -05:00
David S. Miller	f9aa9dc7d2	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net All conflicts were simple overlapping changes except perhaps for the Thunder driver. That driver has a change_mtu method explicitly for sending a message to the hardware. If that fails it returns an error. Normally a driver doesn't need an ndo_change_mtu method becuase those are usually just range changes, which are now handled generically. But since this extra operation is needed in the Thunder driver, it has to stay. However, if the message send fails we have to restore the original MTU before the change because the entire call chain expects that if an error is thrown by ndo_change_mtu then the MTU did not change. Therefore code is added to nicvf_change_mtu to remember the original MTU, and to restore it upon nicvf_update_hw_max_frs() failue. Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-22 13:27:16 -05:00
Linus Torvalds	27e7ab99db	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) Clear congestion control state when changing algorithms on an existing socket, from Florian Westphal. 2) Fix register bit values in altr_tse_pcs portion of stmmac driver, from Jia Jie Ho. 3) Fix PTP handling in stammc driver for GMAC4, from Giuseppe CAVALLARO. 4) Fix udplite multicast delivery handling, it ignores the udp_table parameter passed into the lookups, from Pablo Neira Ayuso. 5) Synchronize the space estimated by rtnl_vfinfo_size and the space actually used by rtnl_fill_vfinfo. From Sabrina Dubroca. 6) Fix memory leak in fib_info when splitting nodes, from Alexander Duyck. 7) If a driver does a napi_hash_del() explicitily and not via netif_napi_del(), it must perform RCU synchronization as needed. Fix this in virtio-net and bnxt drivers, from Eric Dumazet. 8) Likewise, it is not necessary to invoke napi_hash_del() is we are also doing neif_napi_del() in the same code path. Remove such calls from be2net and cxgb4 drivers, also from Eric Dumazet. 9) Don't allocate an ID in peernet2id_alloc() if the netns is dead, from WANG Cong. 10) Fix OF node and device struct leaks in of_mdio, from Johan Hovold. 11) We cannot cache routes in ip6_tunnel when using inherited traffic classes, from Paolo Abeni. 12) Fix several crashes and leaks in cpsw driver, from Johan Hovold. 13) Splice operations cannot use freezable blocking calls in AF_UNIX, from WANG Cong. 14) Link dump filtering by master device and kind support added an error in loop index updates during the dump if we actually do filter, fix from Zhang Shengju. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (59 commits) tcp: zero ca_priv area when switching cc algorithms net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit ethernet: stmmac: make DWMAC_STM32 depend on it's associated SoC tipc: eliminate obsolete socket locking policy description rtnl: fix the loop index update error in rtnl_dump_ifinfo() l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind() net: macb: add check for dma mapping error in start_xmit() rtnetlink: fix FDB size computation netns: fix get_net_ns_by_fd(int pid) typo af_unix: conditionally use freezable blocking calls in read net: ethernet: ti: cpsw: fix fixed-link phy probe deferral net: ethernet: ti: cpsw: add missing sanity check net: ethernet: ti: cpsw: fix secondary-emac probe error path net: ethernet: ti: cpsw: fix of_node and phydev leaks net: ethernet: ti: cpsw: fix deferred probe net: ethernet: ti: cpsw: fix mdio device reference leak net: ethernet: ti: cpsw: fix bad register access in probe error path net: sky2: Fix shutdown crash cfg80211: limit scan results cache size net sched filters: pass netlink message flags in event notification ...	2016-11-21 13:26:28 -08:00
Florian Westphal	e97991832a	tcp: make undo_cwnd mandatory for congestion modules The undo_cwnd fallback in the stack doubles cwnd based on ssthresh, which un-does reno halving behaviour. It seems more appropriate to let congctl algorithms pair .ssthresh and .undo_cwnd properly. Add a 'tcp_reno_undo_cwnd' function and wire it up for all congestion algorithms that used to rely on the fallback. Cc: Eric Dumazet <edumazet@google.com> Cc: Yuchung Cheng <ycheng@google.com> Cc: Neal Cardwell <ncardwell@google.com> Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-21 13:20:17 -05:00
Florian Westphal	85f7e7508a	tcp: add cwnd_undo functions to various tcp cc algorithms congestion control algorithms that do not halve cwnd in their .ssthresh should provide a .cwnd_undo rather than rely on current fallback which assumes reno halving (and thus doubles the cwnd). All of these do 'something else' in their .ssthresh implementation, thus store the cwnd on loss and provide .undo_cwnd to restore it again. A followup patch will remove the fallback and all algorithms will need to provide a .cwnd_undo function. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-21 13:20:17 -05:00
Nikolay Aleksandrov	aa2ae3e71c	bridge: mcast: add MLDv2 querier support This patch adds basic support for MLDv2 queries, the default is MLDv1 as before. A new multicast option - multicast_mld_version, adds the ability to change it between 1 and 2 via netlink and sysfs. The MLD option is disabled if CONFIG_IPV6 is disabled. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-21 13:16:58 -05:00
Nikolay Aleksandrov	5e9235853d	bridge: mcast: add IGMPv3 query support This patch adds basic support for IGMPv3 queries, the default is IGMPv2 as before. A new multicast option - multicast_igmp_version, adds the ability to change it between 2 and 3 via netlink and sysfs. The option struct member is in a 4 byte hole in net_bridge. There also a few minor style adjustments in br_multicast_new_group and br_multicast_add_group. Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-21 13:16:58 -05:00
Florian Westphal	7082c5c3f2	tcp: zero ca_priv area when switching cc algorithms We need to zero out the private data area when application switches connection to different algorithm (TCP_CONGESTION setsockopt). When congestion ops get assigned at connect time everything is already zeroed because sk_alloc uses GFP_ZERO flag. But in the setsockopt case this contains whatever previous cc placed there. Signed-off-by: Florian Westphal <fw@strlen.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-21 13:13:56 -05:00
Gao Feng	7c6ae610a1	net: l2tp: Treat NET_XMIT_CN as success in l2tp_eth_dev_xmit The tc could return NET_XMIT_CN as one congestion notification, but it does not mean the packe is lost. Other modules like ipvlan, macvlan, and others treat NET_XMIT_CN as success too. So l2tp_eth_dev_xmit should add the NET_XMIT_CN check. Signed-off-by: Gao Feng <gfree.wind@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-21 13:10:29 -05:00
Eric Dumazet	d21dbdfe0a	udp: avoid one cache line miss in recvmsg() UDP_SKB_CB(skb)->partial_cov is located at offset 66 in skb, requesting a cold cache line being read in cpu cache. We can avoid this cache line miss for UDP sockets, as partial_cov has a meaning only for UDPLite. Signed-off-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-21 11:26:59 -05:00
Jon Paul Maloy	51b9a31c42	tipc: eliminate obsolete socket locking policy description The comment block in socket.c describing the locking policy is obsolete, and does not reflect current reality. We remove it in this commit. Since the current locking policy is much simpler and follows a mainstream approach, we see no need to add a new description. Signed-off-by: Jon Maloy <jon.maloy@ericsson.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-19 22:15:41 -05:00
Zhang Shengju	3f0ae05d6f	rtnl: fix the loop index update error in rtnl_dump_ifinfo() If the link is filtered out, loop index should also be updated. If not, loop index will not be correct. Fixes: dc599f76c22b0 ("net: Add support for filtering link dump by master device and kind") Signed-off-by: Zhang Shengju <zhangshengju@cmss.chinamobile.com> Acked-by: David Ahern <dsa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-19 22:14:30 -05:00
Alexey Dobriyan	c72d8cdaa5	net: fix bogus cast in skb_pagelen() and use unsigned variables 1) cast to "int" is unnecessary: u8 will be promoted to int before decrementing, small positive numbers fit into "int", so their values won't be changed during promotion. Once everything is int including loop counters, signedness doesn't matter: 32-bit operations will stay 32-bit operations. But! Someone tried to make this loop smart by making everything of the same type apparently in an attempt to optimise it. Do the optimization, just differently. Do the cast where it matters. :^) 2) frag size is unsigned entity and sum of fragments sizes is also unsigned. Make everything unsigned, leave no MOVSX instruction behind. add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-4 (-4) function old new delta skb_cow_data 835 834 -1 ip_do_fragment 2549 2548 -1 ip6_fragment 3130 3128 -2 Total: Before=154865032, After=154865028, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-19 22:11:25 -05:00
Alexey Dobriyan	e0d7924a4a	net: make struct napi_alloc_cache::skb_count unsigned int size_t is way too much for an integer not exceeding 64. Space savings: 10 bytes! add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-10 (-10) function old new delta napi_consume_skb 165 163 -2 __kfree_skb_flush 56 53 -3 __kfree_skb_defer 97 92 -5 Total: Before=154865639, After=154865629, chg -0.00% Signed-off-by: Alexey Dobriyan <adobriyan@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-19 22:11:25 -05:00
Guillaume Nault	32c231164b	l2tp: fix racy SOCK_ZAPPED flag check in l2tp_ip{,6}_bind() Lock socket before checking the SOCK_ZAPPED flag in l2tp_ip6_bind(). Without lock, a concurrent call could modify the socket flags between the sock_flag(sk, SOCK_ZAPPED) test and the lock_sock() call. This way, a socket could be inserted twice in l2tp_ip6_bind_table. Releasing it would then leave a stale pointer there, generating use-after-free errors when walking through the list or modifying adjacent entries. BUG: KASAN: use-after-free in l2tp_ip6_close+0x22e/0x290 at addr ffff8800081b0ed8 Write of size 8 by task syz-executor/10987 CPU: 0 PID: 10987 Comm: syz-executor Not tainted 4.8.0+ #39 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014 ffff880031d97838 ffffffff829f835b ffff88001b5a1640 ffff8800081b0ec0 ffff8800081b15a0 ffff8800081b6d20 ffff880031d97860 ffffffff8174d3cc ffff880031d978f0 ffff8800081b0e80 ffff88001b5a1640 ffff880031d978e0 Call Trace: [<ffffffff829f835b>] dump_stack+0xb3/0x118 lib/dump_stack.c:15 [<ffffffff8174d3cc>] kasan_object_err+0x1c/0x70 mm/kasan/report.c:156 [< inline >] print_address_description mm/kasan/report.c:194 [<ffffffff8174d666>] kasan_report_error+0x1f6/0x4d0 mm/kasan/report.c:283 [< inline >] kasan_report mm/kasan/report.c:303 [<ffffffff8174db7e>] __asan_report_store8_noabort+0x3e/0x40 mm/kasan/report.c:329 [< inline >] __write_once_size ./include/linux/compiler.h:249 [< inline >] __hlist_del ./include/linux/list.h:622 [< inline >] hlist_del_init ./include/linux/list.h:637 [<ffffffff8579047e>] l2tp_ip6_close+0x22e/0x290 net/l2tp/l2tp_ip6.c:239 [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415 [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422 [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570 [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017 [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208 [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244 [<ffffffff813774f9>] task_work_run+0xf9/0x170 [<ffffffff81324aae>] do_exit+0x85e/0x2a00 [<ffffffff81326dc8>] do_group_exit+0x108/0x330 [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307 [<ffffffff811b49af>] do_signal+0x7f/0x18f0 [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156 [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190 [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259 [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6 Object at ffff8800081b0ec0, in cache L2TP/IPv6 size: 1448 Allocated: PID = 10987 [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20 [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0 [ 1116.897025] [<ffffffff8174c9ad>] kasan_kmalloc+0xad/0xe0 [ 1116.897025] [<ffffffff8174cee2>] kasan_slab_alloc+0x12/0x20 [ 1116.897025] [< inline >] slab_post_alloc_hook mm/slab.h:417 [ 1116.897025] [< inline >] slab_alloc_node mm/slub.c:2708 [ 1116.897025] [< inline >] slab_alloc mm/slub.c:2716 [ 1116.897025] [<ffffffff817476a8>] kmem_cache_alloc+0xc8/0x2b0 mm/slub.c:2721 [ 1116.897025] [<ffffffff84c4f6a9>] sk_prot_alloc+0x69/0x2b0 net/core/sock.c:1326 [ 1116.897025] [<ffffffff84c58ac8>] sk_alloc+0x38/0xae0 net/core/sock.c:1388 [ 1116.897025] [<ffffffff851ddf67>] inet6_create+0x2d7/0x1000 net/ipv6/af_inet6.c:182 [ 1116.897025] [<ffffffff84c4af7b>] __sock_create+0x37b/0x640 net/socket.c:1153 [ 1116.897025] [< inline >] sock_create net/socket.c:1193 [ 1116.897025] [< inline >] SYSC_socket net/socket.c:1223 [ 1116.897025] [<ffffffff84c4b46f>] SyS_socket+0xef/0x1b0 net/socket.c:1203 [ 1116.897025] [<ffffffff85e4d685>] entry_SYSCALL_64_fastpath+0x23/0xc6 Freed: PID = 10987 [ 1116.897025] [<ffffffff811ddcb6>] save_stack_trace+0x16/0x20 [ 1116.897025] [<ffffffff8174c736>] save_stack+0x46/0xd0 [ 1116.897025] [<ffffffff8174cf61>] kasan_slab_free+0x71/0xb0 [ 1116.897025] [< inline >] slab_free_hook mm/slub.c:1352 [ 1116.897025] [< inline >] slab_free_freelist_hook mm/slub.c:1374 [ 1116.897025] [< inline >] slab_free mm/slub.c:2951 [ 1116.897025] [<ffffffff81748b28>] kmem_cache_free+0xc8/0x330 mm/slub.c:2973 [ 1116.897025] [< inline >] sk_prot_free net/core/sock.c:1369 [ 1116.897025] [<ffffffff84c541eb>] __sk_destruct+0x32b/0x4f0 net/core/sock.c:1444 [ 1116.897025] [<ffffffff84c5aca4>] sk_destruct+0x44/0x80 net/core/sock.c:1452 [ 1116.897025] [<ffffffff84c5ad33>] __sk_free+0x53/0x220 net/core/sock.c:1460 [ 1116.897025] [<ffffffff84c5af23>] sk_free+0x23/0x30 net/core/sock.c:1471 [ 1116.897025] [<ffffffff84c5cb6c>] sk_common_release+0x28c/0x3e0 ./include/net/sock.h:1589 [ 1116.897025] [<ffffffff8579044e>] l2tp_ip6_close+0x1fe/0x290 net/l2tp/l2tp_ip6.c:243 [ 1116.897025] [<ffffffff850b2dfd>] inet_release+0xed/0x1c0 net/ipv4/af_inet.c:415 [ 1116.897025] [<ffffffff851dc5a0>] inet6_release+0x50/0x70 net/ipv6/af_inet6.c:422 [ 1116.897025] [<ffffffff84c4581d>] sock_release+0x8d/0x1d0 net/socket.c:570 [ 1116.897025] [<ffffffff84c45976>] sock_close+0x16/0x20 net/socket.c:1017 [ 1116.897025] [<ffffffff817a108c>] __fput+0x28c/0x780 fs/file_table.c:208 [ 1116.897025] [<ffffffff817a1605>] ____fput+0x15/0x20 fs/file_table.c:244 [ 1116.897025] [<ffffffff813774f9>] task_work_run+0xf9/0x170 [ 1116.897025] [<ffffffff81324aae>] do_exit+0x85e/0x2a00 [ 1116.897025] [<ffffffff81326dc8>] do_group_exit+0x108/0x330 [ 1116.897025] [<ffffffff81348cf7>] get_signal+0x617/0x17a0 kernel/signal.c:2307 [ 1116.897025] [<ffffffff811b49af>] do_signal+0x7f/0x18f0 [ 1116.897025] [<ffffffff810039bf>] exit_to_usermode_loop+0xbf/0x150 arch/x86/entry/common.c:156 [ 1116.897025] [< inline >] prepare_exit_to_usermode arch/x86/entry/common.c:190 [ 1116.897025] [<ffffffff81006060>] syscall_return_slowpath+0x1a0/0x1e0 arch/x86/entry/common.c:259 [ 1116.897025] [<ffffffff85e4d726>] entry_SYSCALL_64_fastpath+0xc4/0xc6 Memory state around the buggy address: ffff8800081b0d80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff8800081b0e00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc >ffff8800081b0e80: fc fc fc fc fc fc fc fc fb fb fb fb fb fb fb fb ^ ffff8800081b0f00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff8800081b0f80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ================================================================== The same issue exists with l2tp_ip_bind() and l2tp_ip_bind_table. Fixes: c51ce49735c1 ("l2tp: fix oops in L2TP IP sockets for connect() AF_UNSPEC case") Reported-by: Baozeng Ding <sploving1@gmail.com> Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Baozeng Ding <sploving1@gmail.com> Signed-off-by: Guillaume Nault <g.nault@alphalink.fr> Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-19 22:09:21 -05:00
David S. Miller	f463c99b20	This feature patchset includes the following changes: - 6 patches adding functionality to detect a WiFi interface under other virtual interfaces, like VLANs. They introduce a cache for the detected the WiFi configuration to avoid RTNL locking in critical sections. Patches have been prepared by Marek Lindner and Sven Eckelmann - Enable automatic module loading for genl requests, by Sven Eckelmann - Fix a potential race condition on interface removal. This is not happening very often in practice, but requires bigger changes to fix, so we are sending this to net-next. By Linus Luessing -----BEGIN PGP SIGNATURE----- iQJKBAABCgA0FiEE1ilQI7G+y+fdhnrfoSvjmEKSnqEFAlgwVFYWHHN3QHNpbW9u d3VuZGVybGljaC5kZQAKCRChK+OYQpKeoZRaD/9YkdTsT8N629/C2yCrfvt2Zjav xj9+sGtmIq5xtSLkJaiQilz+ua8dCt99TbuzB8c48xXn9O41kejtv6kPE/YzYWVN dkLSO7cpJpT20hAAKD54iRv8m+Ed9ozgTPZd+Lu28fjwDc+FAzdmM1gAKx21Wtk6 SLmRWguA/ezN+FWWLv4HYtThWuOCVOpkhx8Zk7wuzT2PQbryXOIqQ5JOgaKqm1PE iHFhsleaHJ74qnr6UReIZ/g3h27+RPGvhXtAtfo+HEupW4FTZowGr7C6Sm9BCpmU yMQ1DGckNbg51hz+irJH7nGT+9y6UP/mKNduMOW37JcOF2YKyDvXr+A7C+3Nmv/0 F+AoFrDKp6vRBTgyKYvuL8zMvDn5mwCh/436/jIbqRvrCVGJQUY1IsS1yK+kPldy b/XamVCKAzxzVTumIDz5UCOAxMqaJmhLbasSoMqLZum4fuEU/CAZblKc/2lz/2h2 o4Jpka3aGwGSIB+vZC0cat1a3RYKesxKuUmEIU7ZTnySOpP8FoiEZuz2qQhhlfNm fdGnL0YydBO4yOBBmSoSmS64hfvfdwZv9yuXt2NABXJDSD6lfJKNT3MGDlB9phLM OVzO5tQqP9AWel2iFSXafqtuxdqGvv+eFv7PqLGNRqHEr691AIFw10qugx9g4Bul dqlpQ7CoVC16LrOaQw== =Pdlv -----END PGP SIGNATURE----- Merge tag 'batadv-next-for-davem-20161119' of git://git.open-mesh.org/linux-merge Simon Wunderlich says: ==================== This feature patchset includes the following changes: - 6 patches adding functionality to detect a WiFi interface under other virtual interfaces, like VLANs. They introduce a cache for the detected the WiFi configuration to avoid RTNL locking in critical sections. Patches have been prepared by Marek Lindner and Sven Eckelmann - Enable automatic module loading for genl requests, by Sven Eckelmann - Fix a potential race condition on interface removal. This is not happening very often in practice, but requires bigger changes to fix, so we are sending this to net-next. By Linus Luessing ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2016-11-19 11:13:05 -05:00

1 2 3 4 5 ...

44453 Commits