linux-next

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git synced 2025-01-12 08:48:48 +00:00

Author	SHA1	Message	Date
Ursula Braun	51f1de79ad	net/smc: replace sock_put worker by socket refcounting Proper socket refcounting makes the sock_put worker obsolete. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:41:56 -05:00
Ursula Braun	8dce2786a2	net/smc: smc_poll improvements Increase the socket refcount during poll wait. Take the socket lock before checking socket state. For a listening socket return a mask independent of state SMC_ACTIVE and cover errors or closed state as well. Get rid of the accept_q loop in smc_accept_poll(). Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:41:56 -05:00
Ursula Braun	da05bf2981	net/smc: handle device, port, and QP error events RoCE device changes cause an IB event, processed in the global event handler for the ROCE device. Problems for a certain Queue Pair cause a QP event, processed in the QP event handler for this QP. Among those events are port errors and other fatal device errors. All link groups using such a port or device must be terminated in those cases. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:41:56 -05:00
David S. Miller	a81e4affe1	Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next Steffen Klassert says: ==================== pull request (net-next): ipsec-next 2018-01-26 One last patch for this development cycle: 1) Add ESN support for IPSec HW offload. From Yossef Efraim. Please pull or let me know if there are problems. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:22:53 -05:00
David Ahern	fc1e64e109	net/ipv6: Add support for onlink flag Similar to IPv4 allow routes to be added with the RTNH_F_ONLINK flag. The onlink option requires a gateway and a nexthop device. Any unicast gateway is allowed (including IPv4 mapped addresses and unresolved ones) as long as the gateway is not a local address and if it resolves it must match the given device. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:16:43 -05:00
David Ahern	f4797b33db	net/ipv6: Add flags and table id to ip6_nh_lookup_table onlink verification needs to do a lookup in potentially different table than the table in fib6_config and without the RT6_LOOKUP_F_IFACE flag. Change ip6_nh_lookup_table to take table id and flags as input arguments. Both verifications want to ignore link state, so add that flag can stay in the lookup helper. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:16:42 -05:00
David Ahern	1edce99fa8	net/ipv6: Move gateway validation into helper Move existing code to validate nexthop into a helper. Follow on patch adds support for nexthops marked with onlink, and this helper keeps the complexity of ip6_route_info_create in check. Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-26 10:16:42 -05:00
David Ahern	9515a2e082	net/ipv4: Allow send to local broadcast from a socket bound to a VRF Message sends to the local broadcast address (255.255.255.255) require uc_index or sk_bound_dev_if to be set to an egress device. However, responses or only received if the socket is bound to the device. This is overly constraining for processes running in an L3 domain. This patch allows a socket bound to the VRF device to send to the local broadcast address by using IP_UNICAST_IF to set the egress interface with packet receipt handled by the VRF binding. Similar to IP_MULTICAST_IF, relax the constraint on setting IP_UNICAST_IF if a socket is bound to an L3 master device. In this case allow uc_index to be set to an enslaved if sk_bound_dev_if is an L3 master device and is the master device for the ifindex. In udp and raw sendmsg, allow uc_index to override the oif if uc_index master device is oif (ie., the oif is an L3 master and the index is an L3 slave). Signed-off-by: David Ahern <dsahern@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 21:51:31 -05:00
William Tu	fc1372f89f	openvswitch: add erspan version I and II support The patch adds support for openvswitch to configure erspan v1 and v2. The OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS attr is added to uapi as a binary blob to support all ERSPAN v1 and v2's fields. Note that Previous commit "openvswitch: Add erspan tunnel support." was reverted since it does not design properly. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 21:39:43 -05:00
William Tu	c69de58ba8	net: erspan: use bitfield instead of mask and offset Originally the erspan fields are defined as a group into a __be16 field, and use mask and offset to access each field. This is more costly due to calling ntohs/htons. The patch changes it to use bitfields. Signed-off-by: William Tu <u9012063@gmail.com> Acked-by: Pravin B Shelar <pshelar@ovn.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 21:39:43 -05:00
David Ahern	955ec4cb3b	net/ipv6: Do not allow route add with a device that is down IPv6 allows routes to be installed when the device is not up (admin up). Worse, it does not mark it as LINKDOWN. IPv4 does not allow it and really there is no reason for IPv6 to allow it, so check the flags and deny if device is admin down. Signed-off-by: David Ahern <dsahern@gmail.com> Reviewed-by: Ido Schimmel <idosch@mellanox.com> Reviewed-by: Roopa Prabhu <roopa@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 16:22:02 -05:00
Ursula Braun	1a0a04c7a8	net/smc: check for healthy link group resp. connections If a problem for at least one connection of a link group is detected, the whole link group and all its connections are terminated. This patch adds a check for healthy link group when trying to reserve a work request, and checks for healthy connections before starting a tx worker. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 16:10:42 -05:00
Ursula Braun	732720fafd	net/smc: wake up wr_reg_wait when terminating a link group If a new connection with a new rmb is added to a link group, its memory region is registered. If a link group is terminated, a pending registration requires a wake up. And consolidate setting of tx_flag peer_conn_abort in smc_lgr_terminate(). Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 16:10:42 -05:00
Ursula Braun	610db66f37	net/smc: do not reuse a linkgroup with setup problems Once a linkgroup is created successfully, it stays alive for a certain time to service more connections potentially created. If one of the initialization steps for a new linkgroup fails, the linkgroup should not be reused by other connections following. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 16:10:42 -05:00
Ursula Braun	b4772b3a87	net/smc: terminate link group for ib_post_send problems If ib_post_send() fails, terminate all connections of this link group. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 16:10:42 -05:00
Ursula Braun	5ac92a00aa	net/smc: handle state SMC_PEERFINCLOSEWAIT correctly A state transition from closing state SMC_PEERFINCLOSEWAIT to closing state SMC_APPFINCLOSEWAIT is not allowed. Once a closing indication from the peer has been received, the socket reaches state SMC_CLOSED. And receiving a peer_conn_abort just changes the state of the socket into one of the states SMC_PROCESSABORT or SMC_CLOSED; sending a peer_conn_abort occurs in smc_close_active() for state SMC_PROCESSABORT only. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 16:10:42 -05:00
Ursula Braun	611b63a127	net/smc: cancel tx worker in case of socket aborts If an SMC socket is aborted, the tx worker should be cancelled. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 16:10:42 -05:00
Kirill Tkhai	fb07a820fe	net: Move net:netns_ids destruction out of rtnl_lock() and document locking scheme Currently, we unhash a dying net from netns_ids lists under rtnl_lock(). It's a leftover from the time when net::netns_ids was introduced. There was no net::nsid_lock, and rtnl_lock() was mostly need to order modification of alive nets nsid idr, i.e. for: for_each_net(tmp) { ... id = __peernet2id(tmp, net); idr_remove(&tmp->netns_ids, id); ... } Since we have net::nsid_lock, the modifications are protected by this local lock, and now we may introduce better scheme of netns_ids destruction. Let's look at the functions peernet2id_alloc() and get_net_ns_by_id(). Previous commits taught these functions to work well with dying net acquired from rtnl unlocked lists. And they are the only functions which can hash a net to netns_ids or obtain from there. And as easy to check, other netns_ids operating functions works with id, not with net pointers. So, we do not need rtnl_lock to synchronize cleanup_net() with all them. The another property, which is used in the patch, is that net is unhashed from net_namespace_list in the only place and by the only process. So, we avoid excess rcu_read_lock() or rtnl_lock(), when we'are iterating over the list in unhash_nsid(). All the above makes possible to keep rtnl_lock() locked only for net->list deletion, and completely avoid it for netns_ids unhashing and destruction. As these two doings may take long time (e.g., memory allocation to send skb), the patch should positively act on the scalability and signify decrease the time, which rtnl_lock() is held in cleanup_net(). Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-25 11:15:35 -05:00
David S. Miller	8ec59b44a0	Merge branch 'rebased-net-ioctl' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 23:48:11 -05:00
David S. Miller	955bd1d216	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 23:44:15 -05:00
Al Viro	5c59e564e4	kill kernel_sock_ioctl() no users since 2014 Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	44c02a2c3d	dev_ioctl(): move copyin/copyout to callers Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	6a88fbe725	ipconfig: use dev_set_mtu() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	b1b0c24506	lift handling of SIOCIW... out of dev_ioctl() Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	4cf808e7ac	kill dev_ifname32() same story... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	f92d4fc953	kill bond_ioctl() Same story as with dev_ifsioc(), except that the last cases with non-trivial conversions had been taken out in 2013... Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	bf4405737f	kill dev_ifsioc() Once upon a time net/socket.c:dev_ifsioc() used to handle SIOCSHWTSTAMP and SIOCSIFMAP. These have different native and compat layout, so the format conversion had been needed. In 2009 these two cases had been taken out, turning the rest into a convoluted way to calling sock_do_ioctl(). We copy compat structure into native one, call sock_do_ioctl() on that and copy the result back for the in/out ioctls. No layout transformation anywhere, so we might as well just call sock_do_ioctl() and skip all the headache with copying. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	ca25c30040	ip_rt_ioctl(): take copyin to caller Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	03aef17bb7	devinet_ioctl(): take copyin/copyout to caller Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Al Viro	36fd633ec9	net: separate SIOCGIFCONF handling from dev_ioctl() Only two of dev_ioctl() callers may pass SIOCGIFCONF to it. Separating that codepath from the rest of dev_ioctl() allows both to simplify dev_ioctl() itself (all other cases work with struct ifreq ) and* seriously simplify the compat side of that beast: all it takes is passing to inet_gifconf() an extra argument - the size of individual records (sizeof(struct ifreq) or sizeof(struct compat_ifreq)). With dev_ifconf() called directly from sock_do_ioctl()/compat_dev_ifconf() that's easy to arrange. As the result, compat side of SIOCGIFCONF doesn't need any allocations, copy_in_user() back and forth, etc. Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2018-01-24 19:13:45 -05:00
Thomas Winter	5c38bd1b82	ip_tunnel: Use mark in skb by default This allows marks set by connmark in iptables to be used for route lookups. Signed-off-by: Thomas Winter <thomas.winter@alliedtelesis.co.nz> Cc: "David S. Miller" <davem@davemloft.net> Cc: Alexey Kuznetsov <kuznet@ms2.inr.ac.ru> Cc: Hideaki YOSHIFUJI <yoshfuji@linux-ipv6.org> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:30:23 -05:00
Jakub Kicinski	458e704d4d	cls_u32: propagate extack to delete callback Propagate extack on removal of offloaded filter. Don't pass extack from error paths. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:11 -05:00
Jakub Kicinski	f40fe58d13	cls_u32: pass offload flags to tc_cls_common_offload_init() Pass offload flags to the new implementation of tc_cls_common_offload_init(). Extack will now only be set if user requested skip_sw. hnodes need to hold onto the flags now to be able to reuse them on filter removal. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:11 -05:00
Jakub Kicinski	1b0f80375c	cls_flower: propagate extack to delete callback Propagate extack on removal of offloaded filter. Don't pass extack from error paths. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:10 -05:00
Jakub Kicinski	ea2059409c	cls_flower: pass offload flags to tc_cls_common_offload_init() Pass offload flags to the new implementation of tc_cls_common_offload_init(). Extack will now only be set if user requested skip_sw. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:10 -05:00
Jakub Kicinski	b505b29f68	cls_matchall: propagate extack to delete callback Propagate extack on removal of offloaded filter. Don't pass extack from error paths. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:10 -05:00
Jakub Kicinski	93da52b567	cls_matchall: pass offload flags to tc_cls_common_offload_init() Pass offload flags to the new implementation of tc_cls_common_offload_init(). Extack will now only be set if user requested skip_sw. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:10 -05:00
Jakub Kicinski	0e908a450a	cls_bpf: propagate extack to offload delete callback Propagate extack on removal of offloaded filter. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:10 -05:00
Jakub Kicinski	a6ffd6b5d6	cls_bpf: pass offload flags to tc_cls_common_offload_init() Pass offload flags to the new implementation of tc_cls_common_offload_init(). Extack will now only be set if user requested skip_sw. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:10 -05:00
Jakub Kicinski	f558fdea03	cls_bpf: remove gen_flags from bpf_offload cls_bpf now guarantees that only device-bound programs are allowed with skip_sw. The drivers no longer pay attention to flags on filter load, therefore the bpf_offload member can be removed. If flags are needed again they should probably be added to struct tc_cls_common_offload instead. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:10 -05:00
Jakub Kicinski	34832e1c70	net: sched: prepare for reimplementation of tc_cls_common_offload_init() Rename the tc_cls_common_offload_init() helper function to tc_cls_common_offload_init_deprecated() and add a new implementation which also takes flags argument. We will only set extack if flags indicate that offload is forced (skip_sw) otherwise driver errors should be ignored, as they don't influence the overall filter installation. Note that we need the tc_skip_hw() helper for new version, therefore it is added later in the file. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:10 -05:00
Jakub Kicinski	715df5ecab	net: sched: propagate extack to cls->destroy callbacks Propagate extack to cls->destroy callbacks when called from non-error paths. On error paths pass NULL to avoid overwriting the failure message. Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 16:01:09 -05:00
Tom Herbert	e557124023	kcm: Check if sk_user_data already set in kcm_attach This is needed to prevent sk_user_data being overwritten. The check is done under the callback lock. This should prevent a socket from being attached twice to a KCM mux. It also prevents a socket from being attached for other use cases of sk_user_data as long as the other cases set sk_user_data under the lock. Followup work is needed to unify all the use cases of sk_user_data to use the same locking. Reported-by: syzbot+114b15f2be420a8886c3@syzkaller.appspotmail.com Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Signed-off-by: Tom Herbert <tom@quantonium.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 15:54:30 -05:00
Tom Herbert	581e7226a5	kcm: Only allow TCP sockets to be attached to a KCM mux TCP sockets for IPv4 and IPv6 that are not listeners or in closed stated are allowed to be attached to a KCM mux. Fixes: ab7ac4eb9832 ("kcm: Kernel Connection Multiplexor module") Reported-by: syzbot+8865eaff7f9acd593945@syzkaller.appspotmail.com Signed-off-by: Tom Herbert <tom@quantonium.net> Reviewed-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 15:54:30 -05:00
Dmitry Safonov	52e12d5dae	pktgen: Clean read user supplied flag mess Don't use error-prone-brute-force way. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 15:03:36 -05:00
Dmitry Safonov	99c6d3d20d	pktgen: Remove brute-force printing of flags Add macro generated pkt_flag_names array, with a little help of which the flags can be printed by using an index. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 15:03:36 -05:00
Dmitry Safonov	6f107c7412	pktgen: Add behaviour flags macro to generate flags/names PKT_FALGS macro will be used to add package behavior names definitions to simplify the code that prints/reads pkg flags. Sorted the array in order of printing the flags in pktgen_if_show() Note: Renamed IPSEC_ON => IPSEC for simplicity. No visible behavior change expected. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 15:03:36 -05:00
Dmitry Safonov	57a5749b0f	pktgen: Add missing !flag parameters o FLOW_SEQ now can be disabled with pgset "flag !FLOW_SEQ" o FLOW_SEQ and FLOW_RND are antonyms, as it's shown by pktgen_if_show() o IPSEC now may be disabled Note, that IPV6 is enabled with dst6/src6 parameters, not with a flag parameter. Signed-off-by: Dmitry Safonov <dima@arista.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 15:03:36 -05:00
Wolfgang Bumiller	560a66075d	net: sched: em_nbyte: don't add the data offset twice 'ptr' is shifted by the offset and then validated, the memcmp should not add it a second time. Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 14:52:40 -05:00
Ursula Braun	aa377e682d	net/smc: continue waiting if peer signals write_shutdown If the peer sends a shutdown WRITE, this should not affect sending in general, and waiting for send buffer space in particular. Stop waiting of the local socket for send buffer space only, if peer signals closing, but not if peer signals just shutdown WRITE. Signed-off-by: Ursula Braun <ubraun@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2018-01-24 10:52:57 -05:00

1 2 3 4 5 ...

49697 Commits