Julian Wiedmann says:
====================
s390/net: updates 2017-12-20
Please apply the following patch series for 4.16.
Nothing too exciting, mostly just beating the qeth L3 code into shape.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
There's a common helper for parsing an IP address string, let's use it.
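For reference, a minimal sketch of what such parsing looks like with the generic
in4_pton()/in6_pton() helpers (function name and parameters below are
illustrative, not the actual qeth symbols):

#include <linux/inet.h>
#include <linux/errno.h>

/* Illustrative only: convert an ASCII IP address string into binary form. */
static int example_parse_ipaddr(const char *buf, bool is_ipv4, u8 *addr)
{
        int rc;

        if (is_ipv4)
                rc = in4_pton(buf, -1, addr, -1, NULL);
        else
                rc = in6_pton(buf, -1, addr, -1, NULL);

        return rc ? 0 : -EINVAL;
}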
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TSO and IQD paths already need to fix up the current values, and
OSA will require more flexibility in the future as well. So just let
the caller specify the data length.

Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Consolidate the cast type translation, move the passthru path out of
the RCU-guarded section, and use the appropriate rtable helpers when
determining the next-hop address.
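As a rough illustration of the last point, the rtable helpers hide the
gateway-vs-direct distinction (sketch with illustrative names, not the actual
qeth code):

#include <net/route.h>
#include <linux/ip.h>

/* Illustrative only: determine the IPv4 next-hop for a routed skb. */
static __be32 example_nexthop_v4(struct rtable *rt, struct sk_buff *skb)
{
        if (rt)
                return rt_nexthop(rt, ip_hdr(skb)->daddr);
        return ip_hdr(skb)->daddr;
}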
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The L3 packet descriptor's 'dest_addr' field is used for a different
purpose in RX descriptors. Clean up the hard-coded byte accesses and
try to be more self-documenting.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When
1. an skb has no neighbour, and
2. skb->protocol is not IP[V6],
we select the skb's cast type based on its destination MAC address.
The multicast check is currently restricted to Multicast IP-mapped MACs.
Extend it to also cover non-IP Multicast MACs.
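A sketch of the intended classification (illustrative names, not the actual
qeth helpers):

#include <linux/etherdevice.h>
#include <linux/rtnetlink.h>
#include <linux/skbuff.h>

/* Illustrative only: classify by destination MAC when neither a neighbour
 * nor an IP header is available. */
static int example_cast_type_from_mac(struct sk_buff *skb)
{
        const u8 *dst = eth_hdr(skb)->h_dest;

        if (is_broadcast_ether_addr(dst))
                return RTN_BROADCAST;
        if (is_multicast_ether_addr(dst))       /* IP-mapped and non-IP multicast */
                return RTN_MULTICAST;
        return RTN_UNICAST;
}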
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use the proper helpers to check for multicast IP addressing, and remove
some ancient Token Ring code.
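The helpers in question are the generic predicates, roughly (sketch; wrapper
names are illustrative):

#include <linux/in.h>
#include <net/ipv6.h>

static bool example_is_mcast_v4(__be32 addr)
{
        return ipv4_is_multicast(addr);         /* 224.0.0.0/4 */
}

static bool example_is_mcast_v6(const struct in6_addr *addr)
{
        return ipv6_addr_is_multicast(addr);    /* ff00::/8 */
}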
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of assuming that skb->data points to the Ethernet header, use
the right helper and struct to access the Ethertype field.
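In sketch form (illustrative, not the actual qeth hunk):

#include <linux/etherdevice.h>
#include <linux/if_ether.h>
#include <linux/skbuff.h>

/* Illustrative only: read the Ethertype through the mac header accessor
 * instead of assuming skb->data points at the Ethernet header. */
static __be16 example_ethertype(const struct sk_buff *skb)
{
        return eth_hdr(skb)->h_proto;   /* e.g. htons(ETH_P_IP) */
}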
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Once all of qeth_l3_set_rx_mode()'s single-use helpers are folded back
in, the two implementations actually look quite similar. So improve the
readability by converting both set_rx_mode() routines to a common
format.
This also allows us to walk ip_mc_htable just once, instead of three
times.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Be a little more self-documenting, and get rid of OSA_ADDR_LEN.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
For adding/removing a MAC address, use just one helper each that
handles both unicast and multicast.
Saves one level of indirection for multicast addresses, while improving
the error reporting for unicast addresses.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Instead of tracking the uc/mc state in each MAC address object, just
check the multicast bit in the address itself.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
commit "s390/qeth: use ip*_eth_mc_map helpers" removed the last
occurrence of CONFIG_IPV6-dependent code.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of some wrapper indirection, and stop accessing the skb at
hard-coded offsets.
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.
The variable qeth_reply.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
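The conversion follows the usual pattern, roughly (sketch with a simplified
stand-in struct; not the literal qeth diff):

#include <linux/refcount.h>
#include <linux/slab.h>

struct example_reply {                          /* stand-in for qeth_reply */
        refcount_t refcnt;
        /* ... */
};

static void example_get(struct example_reply *r)
{
        refcount_inc(&r->refcnt);               /* was: atomic_inc() */
}

static void example_put(struct example_reply *r)
{
        if (refcount_dec_and_test(&r->refcnt))  /* was: atomic_dec_and_test() */
                kfree(r);
}

/* initialization: refcount_set(&r->refcnt, 1);   was: atomic_set(&r->refcnt, 1) */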
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
[jwi: removed the WARN_ONs. Use CONFIG_REFCOUNT_FULL if you care.]
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
atomic_t variables are currently used to implement reference
counters with the following properties:
- counter is initialized to 1 using atomic_set()
- a resource is freed upon counter reaching zero
- once counter reaches zero, its further
increments aren't allowed
- counter schema uses basic atomic operations
(set, inc, inc_not_zero, dec_and_test, etc.)
Such atomic variables should be converted to a newly provided
refcount_t type and API that prevents accidental counter overflows
and underflows. This is important since overflows and underflows
can lead to use-after-free situation and be exploitable.
The variable lcs_reply.refcnt is used as pure reference counter.
Convert it to refcount_t and fix up the operations.
Suggested-by: Kees Cook <keescook@chromium.org>
Reviewed-by: David Windsor <dwindsor@gmail.com>
Reviewed-by: Hans Liljestrand <ishkamiel@gmail.com>
Signed-off-by: Elena Reshetova <elena.reshetova@intel.com>
[jwi: removed the WARN_ONs. Use CONFIG_REFCOUNT_FULL if you care.]
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Commit 966a967116e6 randomly added alignment to this structure, but
it's actually detrimental to performance of null_blk. Test case:
Running on both the home and remote node shows a ~5% degradation
in performance.
While in there, move blk_status_t to the hole after the integer tag
in the nullb_cmd structure. After this patch, we shrink the size
from 192 to 152 bytes.
Fixes: 966a967116e69 ("smp: Avoid using two cache lines for struct call_single_data")
Signed-off-by: Jens Axboe <axboe@kernel.dk>
A previous change blindly added massive alignment to the
call_single_data structure in struct request. This ballooned it in size
from 296 to 320 bytes on my setup, for no valid reason at all.
Use the unaligned struct __call_single_data variant instead.
Fixes: 966a967116e69 ("smp: Avoid using two cache lines for struct call_single_data")
Cc: stable@vger.kernel.org # v4.14
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Since commit 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse") the
local table uses the same trie allocated for the main table when custom
rules are not in use.
When a net namespace is dismantled, the main table is flushed and freed
(via an RCU callback) before the local table. In case the callback is
invoked before the local table is iterated, a use-after-free can occur.
Fix this by iterating over the FIB tables in reverse order, so that the
main table is always freed after the local table.
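A sketch of the resulting teardown loop (assumed shape, not the literal
upstream hunk; the constant and helpers are the existing FIB ones):

#include <net/ip_fib.h>
#include <net/net_namespace.h>

static void example_fib_net_exit(struct net *net)
{
        int i;

        /* Reverse order: the main table, which may share its trie with the
         * local table, must be flushed/freed after the local table. */
        for (i = FIB_TABLE_HASHSZ - 1; i >= 0; i--) {
                struct hlist_head *head = &net->ipv4.fib_table_hash[i];
                struct hlist_node *tmp;
                struct fib_table *tb;

                hlist_for_each_entry_safe(tb, tmp, head, tb_hlist) {
                        hlist_del(&tb->tb_hlist);
                        fib_table_flush(net, tb);
                        fib_free_table(tb);
                }
        }
}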
v3: Reworded comment according to Alex's suggestion.
v2: Add a comment to make the fix more explicit per Dave's and Alex's
feedback.
Fixes: 0ddcf43d5d4a ("ipv4: FIB Local/MAIN table collapse")
Signed-off-by: Ido Schimmel <idosch@mellanox.com>
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Acked-by: Alexander Duyck <alexander.h.duyck@intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Make sure to check both return code fields before processing the
response. Otherwise we risk operating on invalid data.
Fixes: c9475369bd2b ("s390/qeth: rework RX/TX checksum offload")
Signed-off-by: Julian Wiedmann <jwi@linux.vnet.ibm.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When we receive a JOIN message from a peer member, the message may
contain an advertised window value ADV_IDLE that permits removing the
member in question from the tipc_group::congested list. However, since
the removal has been made conditional on that the advertised window is
*not* ADV_IDLE, we miss this case. This has the effect that a sender
sometimes may enter a state of permanent, false, broadcast congestion.
We fix this by unconditinally removing the member from the congested
list before calling tipc_member_update(), which might potentially sort
it into the list again.
Signed-off-by: Jon Maloy <jon.maloy@ericsson.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Merge tag 'batadv-next-for-davem-20171220' of git://git.open-mesh.org/linux-merge
Simon Wunderlich says:
====================
This feature/cleanup patchset includes the following patches:
- bump version strings, by Simon Wunderlich
- de-inline hash functions to save memory footprint, by Denys Vlasenko
- Add License information to various files, by Sven Eckelmann (3 patches)
- Change batman_adv.h from ISC to MIT, by Sven Eckelmann
- Improve various includes, by Sven Eckelmann (5 patches)
- Lots of kernel-doc work by Sven Eckelmann (8 patches)
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
The kernel config fragment CONFIG_NUMA=y is needed for reuseport_bpf_numa.
Signed-off-by: Naresh Kamboju <naresh.kamboju@linaro.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
Yafang Shao says:
====================
replace tcp_set_state tracepoint with inet_sock_set_state
According to the discussion in the mail thread
https://patchwork.kernel.org/patch/10099243/,
tcp_set_state tracepoint is renamed to inet_sock_set_state tracepoint and is
moved to include/trace/events/sock.h.
With this new tracepoint, we can trace AF_INET/AF_INET6 sock state transitions.
As there's only a single tracepoint for inet, I didn't create a new trace
file named trace/events/inet_sock.h and just placed it in
include/trace/events/sock.h.
Currently TCP/DCCP/SCTP state transitions are traced with this tracepoint.
- Why not more protocols?
If we really think that another protocol should be traced, I will modify the
code to trace it.
I just want to keep the code simple and not output useless information.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
With the changes in the inet_* files, SCTP state transitions are now traced
with the inet_sock_set_state tracepoint.
As the SCTP state names (e.g. SCTP_SS_CLOSED, SCTP_SS_ESTABLISHED) have the
same values as the TCP state names, the output still prints the TCP state
names, which keeps the code simple.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
With the changes in the inet_* files, DCCP state transitions are now traced
with the inet_sock_set_state tracepoint.
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_state_load is only used by AF_INET/AF_INET6, so rename it to
inet_sk_state_load and move it into inet_sock.h.
sk_state_store is removed as it is not used any more.
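For reference, the load helper is essentially the following (sketch; the
function name carries an example_ prefix only to mark it as illustrative):

#include <net/sock.h>

/* Essentially what the renamed helper does; paired with smp_store_release()
 * in the corresponding store/set helpers. */
static inline int example_inet_sk_state_load(const struct sock *sk)
{
        return smp_load_acquire(&sk->sk_state);
}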
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
sk_state is a common field of struct sock, so the state transition
tracepoint should not be a TCP-specific feature.
Currently it traces all AF_INET state transitions, so rename this
tracepoint to inet_sock_set_state, with some minor changes, and move it
into trace/events/sock.h.
We don't need to create a file named trace/events/inet_sock.h for this one
single tracepoint.
Two helpers are introduced to trace sk_state transitions:
- void inet_sk_state_store(struct sock *sk, int newstate);
- void inet_sk_set_state(struct sock *sk, int state);
As trace headers should not be included in other header files,
these helpers are defined in sock.c.
Protocols such as SCTP may be compiled as modules, hence export
inet_sk_set_state().
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
The TCP trace events (specifically tcp_set_state) map enums to symbol
names via __print_symbolic(). But this only works for reading trace events
from the tracefs trace files. If perf or trace-cmd were to record these
events, the event format file does not convert the enum names into numbers,
and you get something like:
__print_symbolic(REC->oldstate,
{ TCP_ESTABLISHED, "TCP_ESTABLISHED" },
{ TCP_SYN_SENT, "TCP_SYN_SENT" },
{ TCP_SYN_RECV, "TCP_SYN_RECV" },
{ TCP_FIN_WAIT1, "TCP_FIN_WAIT1" },
{ TCP_FIN_WAIT2, "TCP_FIN_WAIT2" },
{ TCP_TIME_WAIT, "TCP_TIME_WAIT" },
{ TCP_CLOSE, "TCP_CLOSE" },
{ TCP_CLOSE_WAIT, "TCP_CLOSE_WAIT" },
{ TCP_LAST_ACK, "TCP_LAST_ACK" },
{ TCP_LISTEN, "TCP_LISTEN" },
{ TCP_CLOSING, "TCP_CLOSING" },
{ TCP_NEW_SYN_RECV, "TCP_NEW_SYN_RECV" })
Where trace-cmd and perf do not know the values of those enums.
Use the TRACE_DEFINE_ENUM() macros that will have the trace events convert
the enum strings into their values at system boot. This will allow perf and
trace-cmd to see actual numbers and not enums:
__print_symbolic(REC->oldstate,
{ 1, "TCP_ESTABLISHED" },
{ 2, "TCP_SYN_SENT" },
{ 3, "TCP_SYN_RECV" },
{ 4, "TCP_FIN_WAIT1" },
{ 5, "TCP_FIN_WAIT2" },
{ 6, "TCP_TIME_WAIT" },
{ 7, "TCP_CLOSE" },
{ 8, "TCP_CLOSE_WAIT" },
{ 9, "TCP_LAST_ACK" },
{ 10, "TCP_LISTEN" },
{ 11, "TCP_CLOSING" },
{ 12, "TCP_NEW_SYN_RECV" })
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Acked-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Yafang Shao <laoar.shao@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
- Check the return value with IS_ERR_OR_NULL()
- Add error handling where it was missing
Signed-off-by: Prashant Bhole <bhole_prashant_q7@lab.ntt.co.jp>
Acked-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If md is NULL, tun_dst must be freed, otherwise it will cause a memory
leak.
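The fixed error path looks roughly like this (assumed shape; not the literal
diff, and the function name is illustrative):

#include <linux/skbuff.h>
#include <net/dst_metadata.h>
#include <net/erspan.h>
#include <net/ip_tunnels.h>

/* Illustrative only: release tun_dst before bailing out when no metadata
 * options are attached. */
static int example_collect_md_path(struct sk_buff *skb,
                                   struct metadata_dst *tun_dst)
{
        struct erspan_metadata *md;

        md = ip_tunnel_info_opts(&tun_dst->u.tun_info);
        if (!md) {
                dst_release((struct dst_entry *)tun_dst);  /* avoid the leak */
                return -EINVAL;
        }

        skb_dst_set(skb, (struct dst_entry *)tun_dst);
        return 0;
}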
Fixes: ef7baf5e083c ("ip6_gre: add ip6 erspan collect_md mode")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
If md is NULL, tun_dst must be freed, otherwise it will cause a memory
leak.
Fixes: 1a66a836da6 ("gre: add collect_md mode to ERSPAN tunnel")
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Same as the IPv4 code: when the ip6erspan_rcv call returns PACKET_REJECT, we
should call icmpv6_send to send an ICMP unreachable message in the error path.
Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Acked-by: William Tu <u9012063@gmail.com>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
When the erspan_rcv call returns PACKET_REJECT, we shouldn't call ipgre_rcv
to process the packet again; instead, send an ICMP unreachable message in
the error path.
Fixes: 84e54fe0a5ea ("gre: introduce native tunnel support for ERSPAN")
Acked-by: William Tu <u9012063@gmail.com>
Cc: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
pskb_may_pull() can change skb->data, so we need to load ipv6h/ershdr at
the right place.
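In other words (sketch; names and the exact pull length are illustrative):

#include <linux/errno.h>
#include <linux/ipv6.h>
#include <linux/skbuff.h>

/* pskb_may_pull() may reallocate the skb head, so header pointers must be
 * loaded only after the pull has succeeded. */
static int example_pull_then_parse(struct sk_buff *skb, unsigned int pull_len)
{
        const struct ipv6hdr *ipv6h;

        if (!pskb_may_pull(skb, pull_len))
                return -ENOMEM;

        ipv6h = ipv6_hdr(skb);          /* safe only after the pull */
        /* ... parse the ERSPAN header that follows ... */
        return ipv6h->version == 6 ? 0 : -EINVAL;
}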
Fixes: 5a963eb61b7c ("ip6_gre: Add ERSPAN native tunnel support")
Cc: William Tu <u9012063@gmail.com>
Acked-by: William Tu <u9012063@gmail.com>
Signed-off-by: Haishuang Yan <yanhaishuang@cmss.chinamobile.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Misc fixes for mlx5 core and mlx5 netdev driver.
Merge tag 'mlx5-fixes-2017-12-19' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux
Saeed Mahameed says:
===================
Mellanox, mlx5 fixes 2017-12-19
The following series includes some fixes for the mlx5 core and Ethernet
driver.
Please pull and let me know if there is any problem.
This series doesn't introduce any conflict with the ongoing mlx5 for-next
submission.
For -stable:
kernels >= v4.7.y
("net/mlx5e: Fix possible deadlock of VXLAN lock")
("net/mlx5e: Add refcount to VXLAN structure")
("net/mlx5e: Prevent possible races in VXLAN control flow")
("net/mlx5e: Fix features check of IPv6 traffic")
kernels >= v4.9.y
("net/mlx5: Fix error flow in CREATE_QP command")
("net/mlx5: Fix rate limit packet pacing naming and struct")
kernels >= v4.13.y
("net/mlx5: FPGA, return -EINVAL if size is zero")
kernels >= v4.14.y
("Revert "mlx5: move affinity hints assignments to generic code")
All above patches apply and compile with no issues on corresponding -stable.
===================
Signed-off-by: David S. Miller <davem@davemloft.net>
If a bio is throttled and split after throttling, the bio could be
resubmitted and enter throttling again. This will cause part of the
bio to be charged multiple times. If the cgroup has an IO limit, the
double charge will significantly harm the performance. The bio split
has become quite common after the arbitrary bio size change.
To fix this, we always set the BIO_THROTTLED flag if a bio is throttled.
If the bio is cloned/split, we copy the flag to the new bio too to avoid a
double charge. However, a cloned bio could be directed to a new disk, where
keeping the flag would be a problem. The observation is that we always set a
new disk for the bio in this case, so we can clear the flag in bio_set_dev().
This issue has existed for a long time; the arbitrary bio size change just
makes it worse, so this should go into stable at least since v4.2.
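Sketched out, the flag handling amounts to the following (assumed shape; not
the literal hunks):

#include <linux/bio.h>
#include <linux/blk_types.h>

/* On throttling: charge the bio once, then mark it so that a re-entry into
 * the throttling path skips the charge. */
static void example_charge_and_mark(struct bio *bio)
{
        bio_set_flag(bio, BIO_THROTTLED);
}

/* On clone/split the flag is copied to the new bio so it is not charged a
 * second time; bio_set_dev() clears it again, because a new target disk
 * means the bio legitimately needs a new charge:
 *
 *      bio_clear_flag(bio, BIO_THROTTLED);
 */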
V1 -> V2: Do not add an extra field to the bio, based on discussion with Tejun.
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: stable@vger.kernel.org
Acked-by: Tejun Heo <tj@kernel.org>
Signed-off-by: Shaohua Li <shli@fb.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Jakub Kicinski says:
===================
cls_bpf: fix offload state tracking with block callbacks
After introduction of block callbacks classifiers can no longer track
offload state. cls_bpf used to do that in an attempt to move common
code from drivers to the core. Remove that functionality and fix
drivers.
The user-visible bug this is fixing is that trying to offload a second
filter would trigger a spurious DESTROY and in turn disable the already
installed one.
===================
Signed-off-by: David S. Miller <davem@davemloft.net>
After TC offloads were converted to callbacks we have no choice
but to keep track of the offloaded filter in the driver.
The check for nn->dp.bpf_offload_xdp was a stop-gap solution
to make sure a failed TC offload wouldn't disable XDP; it's no longer
necessary. nfp_net_bpf_offload() will return -EBUSY on
TC vs XDP conflicts.
Fixes: 3f7889c4c79b ("net: sched: cls_bpf: call block callbacks for offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
cls_bpf used to take care of tracking what offload state a filter
is in, i.e. it would track if offload request succeeded or not.
This information would then be used to issue correct requests to
the driver, e.g. requests for statistics only on offloaded filters,
removing only filters which were offloaded, using add instead of
replace if the previous filter was not added, etc.
This tracking of offload state no longer functions with the new
callback infrastructure. There could be multiple entities trying
to offload the same filter.
Throw out all the tracking and corresponding commands and simply
pass to the drivers both old and new bpf program. Drivers will
have to deal with offload state tracking by themselves.
Fixes: 3f7889c4c79b ("net: sched: cls_bpf: call block callbacks for offload")
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Acked-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Get rid of yet another custom hex_dump_to_buffer().
The output is slightly changed, i.e. each byte is followed by a space.
Note, we don't use print_hex_dump() here since the original code uses
netdev_dbg().
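The replacement boils down to something like this (sketch; the buffer size
and function name are illustrative):

#include <linux/kernel.h>
#include <linux/netdevice.h>
#include <linux/printk.h>

/* Illustrative only: format up to one 16-byte row as "xx xx xx ..." and
 * emit it through netdev_dbg(). */
static void example_dump_row(struct net_device *ndev, const u8 *buf, size_t len)
{
        char line[16 * 3 + 1];

        hex_dump_to_buffer(buf, min_t(size_t, len, 16), 16, 1,
                           line, sizeof(line), false);
        netdev_dbg(ndev, "%s\n", line);
}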
Acked-by: Tom Lendacky <thomas.lendacky@amd.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Reported-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Signed-off-by: Michael Chan <michael.chan@broadcom.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Jakub Kicinski says:
====================
netdevsim: couple of build warning fixes
This series fixes two harmless build warning about a symbol which
should be static and an unused variable.
====================
Signed-off-by: David S. Miller <davem@davemloft.net>
skip_sw is set but no longer used.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
struct device_type nsim_dev_type created for SR-IOV support
should be static.
Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
Reviewed-by: Quentin Monnet <quentin.monnet@netronome.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add test cases for gretap and ip6gretap, in both native mode
and external (collect metadata) mode.
Signed-off-by: William Tu <u9012063@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace sscanf() with mac_pton().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Replace sscanf() with mac_pton().
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Acked-by: Andy Gospodarek <andy@greyhouse.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Use
- %pM to print a MAC address,
- mac_pton() to convert it from ASCII to binary format, and
- ether_addr_copy() to copy it,
as sketched below.
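A minimal sketch of the three helpers together (names and context are
illustrative, not the driver's actual code):

#include <linux/etherdevice.h>
#include <linux/if_ether.h>
#include <linux/kernel.h>
#include <linux/netdevice.h>

/* Illustrative only. */
static int example_set_mac(struct net_device *ndev, const char *str)
{
        u8 mac[ETH_ALEN];

        if (!mac_pton(str, mac))                 /* ASCII -> binary */
                return -EINVAL;

        ether_addr_copy(ndev->dev_addr, mac);    /* copy all ETH_ALEN bytes */
        netdev_info(ndev, "MAC address set to %pM\n", ndev->dev_addr);
        return 0;
}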
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Stephen Hemminger <stephen@networkplumber.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
(I can trivially verify that the idr_remove in cleanup_net happens
after the network namespace count has dropped to zero --EWB)
Function get_net_ns_by_id() does not check for net::count
after it has found a peer in the netns_ids idr.
It may dereference a peer after its count has already been
finally decremented. This leads to a double free and memory
corruption:
put_net(peer)                                 rtnl_lock()
atomic_dec_and_test(&peer->count) [count=0]   ...
__put_net(peer)                               get_net_ns_by_id(net, id)
  spin_lock(&cleanup_list_lock)
  list_add(&net->cleanup_list, &cleanup_list)
  spin_unlock(&cleanup_list_lock)
queue_work()                                  peer = idr_find(&net->netns_ids, id)
  |                                           get_net(peer)    [count=1]
  |                                           ...
  |                                           (use after final put)
  v                                           ...
cleanup_net()                                 ...
  spin_lock(&cleanup_list_lock)               ...
  list_replace_init(&cleanup_list, ..)        ...
  spin_unlock(&cleanup_list_lock)             ...
  ...                                         ...
  ...                                         put_net(peer)
  ...                                           atomic_dec_and_test(&peer->count) [count=0]
  ...                                           spin_lock(&cleanup_list_lock)
  ...                                           list_add(&net->cleanup_list, &cleanup_list)
  ...                                           spin_unlock(&cleanup_list_lock)
  ...                                         queue_work()
  ...                                         rtnl_unlock()
rtnl_lock()                                   ...
for_each_net(tmp) {                           ...
  id = __peernet2id(tmp, peer)                ...
  spin_lock_irq(&tmp->nsid_lock)              ...
  idr_remove(&tmp->netns_ids, id)             ...
  ...                                         ...
  net_drop_ns()                               ...
    net_free(peer)                            ...
}                                             ...
                                                |
                                                v
                                              cleanup_net()
                                                ...
                                                (Second free of peer)
Also, put_net() on the right CPU may reorder with the left CPU's
list_replace_init(&cleanup_list, ..), and then cleanup_list
will be corrupted.
Since cleanup_net() is executed in a worker thread, while
put_net(peer) can happen anywhere, there should be
enough time for a concurrent get_net_ns_by_id() to pick
the peer up, and the race does not seem to be unlikely.
The patch fixes the problem in the standard way.
(Also, there is a possible problem in peernet2id_alloc(), which requires
a check for net::count under nsid_lock and maybe_get_net(peer), but
in current stable kernels it's used under rtnl_lock() and it has to be
safe. Open vSwitch has begun to use peernet2id_alloc(), and possibly it
should be fixed too. Since this is not in a stable kernel yet, I'll send
a separate message to netdev@ later.)
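The standard way, roughly, is to replace the unconditional get_net() with
maybe_get_net() so the lookup fails gracefully once the final put has run
(sketch, assumed shape; not the literal upstream diff):

#include <linux/idr.h>
#include <linux/rcupdate.h>
#include <net/net_namespace.h>

static struct net *example_get_net_ns_by_id(struct net *net, int id)
{
        struct net *peer;

        rcu_read_lock();
        peer = idr_find(&net->netns_ids, id);
        if (peer)
                peer = maybe_get_net(peer);  /* NULL if count already hit zero */
        rcu_read_unlock();

        return peer;
}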
Cc: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
Fixes: 0c7aecd4bde4 ("netns: add rtnl cmd to add and get peer netns ids")
Reviewed-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Reviewed-by: "Eric W. Biederman" <ebiederm@xmission.com>
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Acked-by: Nicolas Dichtel <nicolas.dichtel@6wind.com>
Signed-off-by: David S. Miller <davem@davemloft.net>