linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-13 00:29:50 +00:00

Author	SHA1	Message	Date
Xin Long	68913a018f	netfilter: ipvs: do not create conn for ABORT packet in sctp_conn_schedule There's no reason for ipvs to create a conn for an ABORT packet even if sysctl_sloppy_sctp is set. This patch is to accept it without creating a conn, just as ipvs does for tcp's RST packet. Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-09-08 13:40:23 +02:00
Xin Long	1cc4a01866	netfilter: ipvs: fix the issue that sctp_conn_schedule drops non-INIT packet Commit 5e26b1b3abce ("ipvs: support scheduling inverse and icmp SCTP packets") changed to check packet type early. It introduced a side effect: if it's not a INIT packet, ports will be set as NULL, and the packet will be dropped later. It caused that sctp couldn't create connection when ipvs module is loaded and any scheduler is registered on server. Li Shuang reproduced it by running the cmds on sctp server: # ipvsadm -A -t 1.1.1.1:80 -s rr # ipvsadm -D -t 1.1.1.1:80 then the server could't work any more. This patch is to return 1 when it's not an INIT packet. It means ipvs will accept it without creating a conn for it, just like what it does for tcp. Fixes: 5e26b1b3abce ("ipvs: support scheduling inverse and icmp SCTP packets") Reported-by: Li Shuang <shuali@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: Simon Horman <horms@verge.net.au> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-09-08 13:40:02 +02:00
Håkon Bugge	126f760ca9	rds: Fix incorrect statistics counting In rds_send_xmit() there is logic to batch the sends. However, if another thread has acquired the lock and has incremented the send_gen, it is considered a race and we yield. The code incrementing the s_send_lock_queue_raced statistics counter did not count this event correctly. This commit counts the race condition correctly. Changes from v1: - Removed check for someone_on_xmit() - Fixed incorrect indentation Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-07 20:07:13 -07:00
Paolo Abeni	ca2c1418ef	udp: drop head states only when all skb references are gone After commit 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options processing") we clear the skb head state as soon as the skb carrying them is first processed. Since the same skb can be processed several times when MSG_PEEK is used, we can end up lacking the required head states, and eventually oopsing. Fix this clearing the skb head state only when processing the last skb reference. Reported-by: Eric Dumazet <edumazet@google.com> Fixes: 0ddf3fb2c43d ("udp: preserve skb->dst if required for IP options processing") Signed-off-by: Paolo Abeni <pabeni@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-07 20:02:39 -07:00
Xin Long	5c25f30c93	ip6_gre: update mtu properly in ip6gre_err Now when probessing ICMPV6_PKT_TOOBIG, ip6gre_err only subtracts the offset of gre header from mtu info. The expected mtu of gre device should also subtract gre header. Otherwise, the next packets still can't be sent out. Jianlin found this issue when using the topo: client(ip6gre)<---->(nic1)route(nic2)<----->(ip6gre)server and reducing nic2's mtu, then both tcp and sctp's performance with big size data became 0. This patch is to fix it by also subtracting grehdr (tun->tun_hlen) from mtu info when updating gre device's mtu in ip6gre_err(). It also needs to subtract ETH_HLEN if gre dev'type is ARPHRD_ETHER. Reported-by: Jianlin Shi <jishi@redhat.com> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-07 19:59:47 -07:00
Jiri Pirko	80532384af	net: sched: fix memleak for chain zero There's a memleak happening for chain 0. The thing is, chain 0 needs to be always present, not created on demand. Therefore tcf_block_get upon creation of block calls the tcf_chain_create function directly. The chain is created with refcnt == 1, which is not correct in this case and causes the memleak. So move the refcnt increment into tcf_chain_get function even for the case when chain needs to be created. Reported-by: Jakub Kicinski <kubakici@wp.pl> Fixes: 5bc1701881e3 ("net: sched: introduce multichain support for filters") Signed-off-by: Jiri Pirko <jiri@mellanox.com> Tested-by: Jakub Kicinski <jakub.kicinski@netronome.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-07 19:17:20 -07:00
David S. Miller	0f2be423f1	Back from a long absence, so we have a number of things: * a remain-on-channel fix from Avi * hwsim TX power fix from Beni * null-PTR dereference with iTXQ in some rare configurations (Chunho) * 40 MHz custom regdomain fixes (Emmanuel) * look at right place in HT/VHT capability parsing (Igor) * complete A-MPDU teardown properly (Ilan) * Mesh ID Element ordering fix (Liad) * avoid tracing warning in ht_dbg() (Sharon) * fix print of assoc/reassoc (Simon) * fix encrypted VLAN with iTXQ (myself) * fix calling context of TX queue wake (myself) * fix a deadlock with ath10k aggregation (myself) -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEExu3sM/nZ1eRSfR9Ha3t4Rpy0AB0FAlmw78wACgkQa3t4Rpy0 AB1viw/+K2xrwzsKqrNoNM1sV4bPItUTjay64dPVD5CjJ/pAwou6HCu0gCJCh4kt mXhLWHds7Q4sBY+DlN9eIagQLJUaw897FWV+tHHirDGKMsE4tBaIct7PLBpM7r5O H03T5qT9+nDGRAJq6ucLG8v91cTAlBNfEIV73Au9Oi5B0Rq4cs+Tz8xS24EHjfTB zRcLMaE8qoQjIfrwQsYNQBdvYHY5G+Ui5sbPh3HPLDPzAfKAsc75nbikI2QE//s0 cMv5ro39vy0DGyQmdTqNzzzuWWzYvhUD7EiIr7Dfm9ilhljCiVqZg6y7ZVMB/QNq +HRD7ShbTnNMx1fx8w5WO6gKGVSeo0Ga6KKEauTGiWJQTfZQLuIBLylSMVclfvBN 4zOv3vC9EUP5qqPt0cby7VV2D+1Z4Lw2GYZZKHF5numMkgHAoDJ+tJHbBFmz1CEX co/79RFhGLKvZE+8lN40hqvPoYA5NOUO6jyOq384ZbnC190nVqOXvIxi9jmFKBHp rGBE/8e0VPYlc48m6NUFwAvc0HOeN3/ZVaUnoo6SY8fCbru3yhRYzC3pmcepTEbA OVBHirgYtntI2mk4FWd2dkTC6aOfP1o11dwm3deaaEtkwaiKlxI2xfnkbsGaMaOh RW787Y10g0k785ABD/GxynOeqfiXnIxIjMKZiQliR33zxdv4cAI= =QYS4 -----END PGP SIGNATURE----- Merge tag 'mac80211-for-davem-2017-09-07' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211 Johannes Berg says: ==================== Back from a long absence, so we have a number of things: * a remain-on-channel fix from Avi * hwsim TX power fix from Beni * null-PTR dereference with iTXQ in some rare configurations (Chunho) * 40 MHz custom regdomain fixes (Emmanuel) * look at right place in HT/VHT capability parsing (Igor) * complete A-MPDU teardown properly (Ilan) * Mesh ID Element ordering fix (Liad) * avoid tracing warning in ht_dbg() (Sharon) * fix print of assoc/reassoc (Simon) * fix encrypted VLAN with iTXQ (myself) * fix calling context of TX queue wake (myself) * fix a deadlock with ath10k aggregation (myself) ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-07 09:40:58 -07:00
Linus Torvalds	608c1d3c17	Merge branch 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup Pull cgroup updates from Tejun Heo: "Several notable changes this cycle: - Thread mode was merged. This will be used for cgroup2 support for CPU and possibly other controllers. Unfortunately, CPU controller cgroup2 support didn't make this pull request but most contentions have been resolved and the support is likely to be merged before the next merge window. - cgroup.stat now shows the number of descendant cgroups. - cpuset now can enable the easier-to-configure v2 behavior on v1 hierarchy" * 'for-4.14' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (21 commits) cpuset: Allow v2 behavior in v1 cgroup cgroup: Add mount flag to enable cpuset to use v2 behavior in v1 cgroup cgroup: remove unneeded checks cgroup: misc changes cgroup: short-circuit cset_cgroup_from_root() on the default hierarchy cgroup: re-use the parent pointer in cgroup_destroy_locked() cgroup: add cgroup.stat interface with basic hierarchy stats cgroup: implement hierarchy limits cgroup: keep track of number of descent cgroups cgroup: add comment to cgroup_enable_threaded() cgroup: remove unnecessary empty check when enabling threaded mode cgroup: update debug controller to print out thread mode information cgroup: implement cgroup v2 thread support cgroup: implement CSS_TASK_ITER_THREADED cgroup: introduce cgroup->dom_cgrp and threaded css_set handling cgroup: add @flags to css_task_iter_start() and implement CSS_TASK_ITER_PROCS cgroup: reorganize cgroup.procs / task write path cgroup: replace css_set walking populated test with testing cgrp->nr_populated_csets cgroup: distinguish local and children populated states cgroup: remove now unused list_head @pending in cgroup_apply_cftypes() ...	2017-09-06 22:25:25 -07:00
Kleber Sacilotto de Souza	8e0deed924	tipc: remove unnecessary call to dev_net() The net device is already stored in the 'net' variable, so no need to call dev_net() again. Signed-off-by: Kleber Sacilotto de Souza <kleber.souza@canonical.com> Acked-by: Ying Xue <ying.xue@windriver.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-06 21:25:52 -07:00
Xin Long	f773608026	netlink: access nlk groups safely in netlink bind and getname Now there is no lock protecting nlk ngroups/groups' accessing in netlink bind and getname. It's safe from nlk groups' setting in netlink_release, but not from netlink_realloc_groups called by netlink_setsockopt. netlink_lock_table is needed in both netlink bind and getname when accessing nlk groups. Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-06 21:22:54 -07:00
Xin Long	be82485fbc	netlink: fix an use-after-free issue for nlk groups ChunYu found a netlink use-after-free issue by syzkaller: [28448.842981] BUG: KASAN: use-after-free in __nla_put+0x37/0x40 at addr ffff8807185e2378 [28448.969918] Call Trace: [...] [28449.117207] __nla_put+0x37/0x40 [28449.132027] nla_put+0xf5/0x130 [28449.146261] sk_diag_fill.isra.4.constprop.5+0x5a0/0x750 [netlink_diag] [28449.176608] __netlink_diag_dump+0x25a/0x700 [netlink_diag] [28449.202215] netlink_diag_dump+0x176/0x240 [netlink_diag] [28449.226834] netlink_dump+0x488/0xbb0 [28449.298014] __netlink_dump_start+0x4e8/0x760 [28449.317924] netlink_diag_handler_dump+0x261/0x340 [netlink_diag] [28449.413414] sock_diag_rcv_msg+0x207/0x390 [28449.432409] netlink_rcv_skb+0x149/0x380 [28449.467647] sock_diag_rcv+0x2d/0x40 [28449.484362] netlink_unicast+0x562/0x7b0 [28449.564790] netlink_sendmsg+0xaa8/0xe60 [28449.661510] sock_sendmsg+0xcf/0x110 [28449.865631] __sys_sendmsg+0xf3/0x240 [28450.000964] SyS_sendmsg+0x32/0x50 [28450.016969] do_syscall_64+0x25c/0x6c0 [28450.154439] entry_SYSCALL64_slow_path+0x25/0x25 It was caused by no protection between nlk groups' free in netlink_release and nlk groups' accessing in sk_diag_dump_groups. The similar issue also exists in netlink_seq_show(). This patch is to defer nlk groups' free in deferred_put_nlk_sk. Reported-by: ChunYu Wang <chunwang@redhat.com> Acked-by: Florian Westphal <fw@strlen.de> Signed-off-by: Xin Long <lucien.xin@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-06 21:22:53 -07:00
Gao Feng	39ad1297a2	sched: Use __qdisc_drop instead of kfree_skb in sch_prio and sch_qfq The commit 520ac30f4551 ("net_sched: drop packets after root qdisc lock is released) made a big change of tc for performance. There are two points left in sch_prio and sch_qfq which are not changed with that commit. Now enhance them now with __qdisc_drop. Signed-off-by: Gao Feng <gfree.wind@vip.163.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-06 21:20:07 -07:00
Linus Torvalds	aae3dbb477	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next Pull networking updates from David Miller: 1) Support ipv6 checksum offload in sunvnet driver, from Shannon Nelson. 2) Move to RB-tree instead of custom AVL code in inetpeer, from Eric Dumazet. 3) Allow generic XDP to work on virtual devices, from John Fastabend. 4) Add bpf device maps and XDP_REDIRECT, which can be used to build arbitrary switching frameworks using XDP. From John Fastabend. 5) Remove UFO offloads from the tree, gave us little other than bugs. 6) Remove the IPSEC flow cache, from Florian Westphal. 7) Support ipv6 route offload in mlxsw driver. 8) Support VF representors in bnxt_en, from Sathya Perla. 9) Add support for forward error correction modes to ethtool, from Vidya Sagar Ravipati. 10) Add time filter for packet scheduler action dumping, from Jamal Hadi Salim. 11) Extend the zerocopy sendmsg() used by virtio and tap to regular sockets via MSG_ZEROCOPY. From Willem de Bruijn. 12) Significantly rework value tracking in the BPF verifier, from Edward Cree. 13) Add new jump instructions to eBPF, from Daniel Borkmann. 14) Rework rtnetlink plumbing so that operations can be run without taking the RTNL semaphore. From Florian Westphal. 15) Support XDP in tap driver, from Jason Wang. 16) Add 32-bit eBPF JIT for ARM, from Shubham Bansal. 17) Add Huawei hinic ethernet driver. 18) Allow to report MD5 keys in TCP inet_diag dumps, from Ivan Delalande. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next: (1780 commits) i40e: point wb_desc at the nvm_wb_desc during i40e_read_nvm_aq i40e: avoid NVM acquire deadlock during NVM update drivers: net: xgene: Remove return statement from void function drivers: net: xgene: Configure tx/rx delay for ACPI drivers: net: xgene: Read tx/rx delay for ACPI rocker: fix kcalloc parameter order rds: Fix non-atomic operation on shared flag variable net: sched: don't use GFP_KERNEL under spin lock vhost_net: correctly check tx avail during rx busy polling net: mdio-mux: add mdio_mux parameter to mdio_mux_init() rxrpc: Make service connection lookup always check for retry net: stmmac: Delete dead code for MDIO registration gianfar: Fix Tx flow control deactivation cxgb4: Ignore MPS_TX_INT_CAUSE[Bubble] for T6 cxgb4: Fix pause frame count in t4_get_port_stats cxgb4: fix memory leak tun: rename generic_xdp to skb_xdp tun: reserve extra headroom only when XDP is set net: dsa: bcm_sf2: Configure IMP port TC2QOS mapping net: dsa: bcm_sf2: Advertise number of egress queues ...	2017-09-06 14:45:08 -07:00
Douglas Fuller	06d74376c8	ceph: more accurate statfs Improve accuracy of statfs reporting for Ceph filesystems comprising exactly one data pool. In this case, the Ceph monitor can now report the space usage for the single data pool instead of the global data for the entire Ceph cluster. Include support for this message in mon_client and leverage it in ceph/super. Signed-off-by: Douglas Fuller <dfuller@redhat.com> Reviewed-by: Yan, Zheng <zyan@redhat.com> Reviewed-by: Ilya Dryomov <idryomov@gmail.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-09-06 19:56:49 +02:00
Yanhu Cao	3fb99d483e	ceph: nuke startsync op startsync is a no-op, has been for years. Remove it. Link: http://tracker.ceph.com/issues/20604 Signed-off-by: Yanhu Cao <gmayyyha@gmail.com> Reviewed-by: "Yan, Zheng" <zyan@redhat.com> Signed-off-by: Ilya Dryomov <idryomov@gmail.com>	2017-09-06 19:56:43 +02:00
NeilBrown	f1ecbc21eb	SUNRPC: remove some dead code. RPC_TASK_NO_RETRANS_TIMEOUT is set when cl_noretranstimeo is set, which happens when RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT is set, which happens when NFS_CS_NO_RETRANS_TIMEOUT is set. This flag means "don't resend on a timeout, only resend if the connection gets broken for some reason". cl_discrtry is set when RPC_CLNT_CREATE_DISCRTRY is set, which happens when NFS_CS_DISCRTRY is set. This flag means "always disconnect before resending". NFS_CS_NO_RETRANS_TIMEOUT and NFS_CS_DISCRTRY are both only set in nfs4_init_client(), and it always sets both. So we will never have a situation where only one of the flags is set. So this code, which tests if timeout retransmits are allowed, and disconnection is required, will never run. So it makes sense to remove this code as it cannot be tested and could confuse people reading the code (like me). (alternately we could leave it there with a comment saying it is never actually used). Signed-off-by: NeilBrown <neilb@suse.com> Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-09-06 12:31:15 -04:00
Johannes Berg	bde59c475e	mac80211: fix deadlock in driver-managed RX BA session start When an RX BA session is started by the driver, and it has to tell mac80211 about it, the corresponding bit in tid_rx_manage_offl gets set and the BA session work is scheduled. Upon testing this bit, it will call __ieee80211_start_rx_ba_session(), thus deadlocking as it already holds the ampdu_mlme.mtx, which that acquires again. Fix this by adding ___ieee80211_start_rx_ba_session(), a version of the function that requires the mutex already held. Cc: stable@vger.kernel.org Fixes: 699cb58c8a52 ("mac80211: manage RX BA session offload without SKB queue") Reported-by: Matteo Croce <mcroce@redhat.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-06 15:22:02 +02:00
Ilan peer	98e93e968e	mac80211: Complete ampdu work schedule during session tear down Commit 7a7c0a6438b8 ("mac80211: fix TX aggregation start/stop callback race") added a cancellation of the ampdu work after the loop that stopped the Tx and Rx BA sessions. However, in some cases, e.g., during HW reconfig, the low level driver might call mac80211 APIs to complete the stopping of the BA sessions, which would queue the ampdu work to handle the actual completion. This work needs to be performed as otherwise mac80211 data structures would not be properly synced. Fix this by checking if BA session STOP_CB bit is set after the BA session cancellation and properly clean the session. Signed-off-by: Ilan Peer <ilan.peer@intel.com> [Johannes: the work isn't flushed because that could do other things we don't want, and the locking situation isn't clear] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-06 15:22:02 +02:00
Emmanuel Grumbach	4e0854a74f	cfg80211: honor NL80211_RRF_NO_HT40{MINUS,PLUS} Honor the NL80211_RRF_NO_HT40{MINUS,PLUS} flags in reg_process_ht_flags_channel. Not doing so leads can lead to a firmware assert in iwlwifi for example. Fixes: b0d7aa59592b ("cfg80211: allow wiphy specific regdomain management") Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-06 12:56:31 +02:00
David S. Miller	18fb0b46d5	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net	2017-09-05 20:03:35 -07:00
Chuck Lever	9590d083c1	xprtrdma: Use xprt_pin_rqst in rpcrdma_reply_handler Adopt the use of xprt_pin_rqst to eliminate contention between Call-side users of rb_lock and the use of rb_lock in rpcrdma_reply_handler. This replaces the mechanism introduced in 431af645cf66 ("xprtrdma: Fix client lock-up after application signal fires"). Use recv_lock to quickly find the completing rqst, pin it, then drop the lock. At that point invalidation and pull-up of the Reply XDR can be done. Both are often expensive operations. Finally, take recv_lock again to signal completion to the RPC layer. It also protects adjustment of "cwnd". This greatly reduces the amount of time a lock is held by the reply handler. Comparing lock_stat results shows a marked decrease in contention on rb_lock and recv_lock. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> [trond.myklebust@primarydata.com: Remove call to rpcrdma_buffer_put() from the "out_norqst:" path in rpcrdma_reply_handler.] Signed-off-by: Trond Myklebust <trond.myklebust@primarydata.com>	2017-09-05 18:27:07 -04:00
Håkon Bugge	f530f39f5f	rds: Fix non-atomic operation on shared flag variable The bits in m_flags in struct rds_message are used for a plurality of reasons, and from different contexts. To avoid any missing updates to m_flags, use the atomic set_bit() instead of the non-atomic equivalent. Signed-off-by: Håkon Bugge <haakon.bugge@oracle.com> Reviewed-by: Knut Omang <knut.omang@oracle.com> Reviewed-by: Wei Lin Guay <wei.lin.guay@oracle.com> Acked-by: Santosh Shilimkar <santosh.shilimkar@oracle.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 14:49:49 -07:00
Jakub Kicinski	2c8468dcf8	net: sched: don't use GFP_KERNEL under spin lock The new TC IDR code uses GFP_KERNEL under spin lock. Which leads to: [ 582.621091] BUG: sleeping function called from invalid context at ../mm/slab.h:416 [ 582.629721] in_atomic(): 1, irqs_disabled(): 0, pid: 3379, name: tc [ 582.636939] 2 locks held by tc/3379: [ 582.641049] #0: (rtnl_mutex){+.+.+.}, at: [<ffffffff910354ce>] rtnetlink_rcv_msg+0x92e/0x1400 [ 582.650958] #1: (&(&tn->idrinfo->lock)->rlock){+.-.+.}, at: [<ffffffff9110a5e0>] tcf_idr_create+0x2f0/0x8e0 [ 582.662217] Preemption disabled at: [ 582.662222] [<ffffffff9110a5e0>] tcf_idr_create+0x2f0/0x8e0 [ 582.672592] CPU: 9 PID: 3379 Comm: tc Tainted: G W 4.13.0-rc7-debug-00648-g43503a79b9f0 #287 [ 582.683432] Hardware name: Dell Inc. PowerEdge R730/072T6D, BIOS 2.3.4 11/08/2016 [ 582.691937] Call Trace: ... [ 582.742460] kmem_cache_alloc+0x286/0x540 [ 582.747055] radix_tree_node_alloc.constprop.6+0x4a/0x450 [ 582.753209] idr_get_free_cmn+0x627/0xf80 ... [ 582.815525] idr_alloc_cmn+0x1a8/0x270 ... [ 582.833804] tcf_idr_create+0x31b/0x8e0 ... Try to preallocate the memory with idr_prealloc(GFP_KERNEL) (as suggested by Eric Dumazet), and change the allocation flags under spin lock. Fixes: 65a206c01e8e ("net/sched: Change act_api and act_xxx modules to use IDR") Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com> Reviewed-by: Simon Horman <simon.horman@netronome.com> Acked-by: Eric Dumazet <edumazet@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 14:48:29 -07:00
David Howells	fdade4f69e	rxrpc: Make service connection lookup always check for retry When an RxRPC service packet comes in, the target connection is looked up by an rb-tree search under RCU and a read-locked seqlock; the seqlock retry check is, however, currently skipped if we got a match, but probably shouldn't be in case the connection we found gets replaced whilst we're doing a search. Make the lookup procedure always go through need_seqretry(), even if the lookup was successful. This makes sure we always pick up on a write-lock event. On the other hand, since we don't take a ref on the object, but rely on RCU to prevent its destruction after dropping the seqlock, I'm not sure this is necessary. Signed-off-by: David Howells <dhowells@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 14:39:17 -07:00
Trond Myklebust	f9773b22a2	NFS-over-RDMA client updates for Linux 4.14 Bugfixes and cleanups: - Constify rpc_xprt_ops - Harden RPC call encoding and decoding - Clean up rpc call decoding to use xdr_streams - Remove unused variables from various structures - Refactor code to remove imul instructions - Rearrange rx_stats structure for better cacheline sharing -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEnZ5MQTpR7cLU7KEp18tUv7ClQOsFAlmgfA4ACgkQ18tUv7Cl QOsbXBAAnNaCWwerMGi7IbPcvA8aIQLcaruVUVuI2HIUdwb0At3EBakLJr5vFong IbUPEegi2F7Dm8gwwQ8Ntb0gqGER1mHr0Bd4tcls+cNxwKNpRad/cv8ZjN4AMVpz Kf1ZQOSDoRyJxwnAaRTYsU302tkWQFHrBjpCXpvgI3uoQ7kJwC1sZpXH6qN+r9E3 hFlkzZJ6gkZE3Rx3XsQqjl+TFZ3amd9Yl1AjzND622oLItmcJiRoptCVz8jYEFBJ uYvg22jbZWIrI66pPXnX+TuDfkbA6nFuSqJma0VLZAyTGKtRzJpaExvSJuuMqLm1 ZuWgWXIO3Kvvyx4gTvRFq06TAlunjOHlxb+39Yr41w2LLcDitvTmv2t/o8+BcVCp fkaziwZIqkfXoE4+3SGRC0s+R5obtgjAiTlAPTwno9p8T7jC+x43fdPF9l5jgAs+ 0jtl1d+whQK0yGITq7zwbLimLxxz12f8S9JH6U4umkL/A458ApRVuUQfoCHzl4wk ZPG1DGZjPBClM3R//XfUargfs/uM2FO6u0Z4+mxxdyJAHrdExczDC6OE9lLG9hnR KQEa7PVDjQZssNHOY0Nu3QaTpBoVxmN6xiDMTtXdf+ltd2m/ja18lER3tB9IwpXD +RqIJ8aFat3oP76tZ8CNJ7LiRORzmqDTcfjWkpCDPK259OK7FFU= =fdZG -----END PGP SIGNATURE----- Merge tag 'nfs-rdma-for-4.14-1' of git://git.linux-nfs.org/projects/anna/linux-nfs into linux-next NFS-over-RDMA client updates for Linux 4.14 Bugfixes and cleanups: - Constify rpc_xprt_ops - Harden RPC call encoding and decoding - Clean up rpc call decoding to use xdr_streams - Remove unused variables from various structures - Refactor code to remove imul instructions - Rearrange rx_stats structure for better cacheline sharing	2017-09-05 15:16:04 -04:00
Chuck Lever	26fb2254dd	svcrdma: Estimate Send Queue depth properly The rdma_rw API adjusts max_send_wr upwards during the rdma_create_qp() call. If the ULP actually wants to take advantage of these extra resources, it must increase the size of its send completion queue (created before rdma_create_qp is called) and increase its send queue accounting limit. Use the new rdma_rw_mr_factor API to figure out the correct value to use for the Send Queue and Send Completion Queue depths. And, ensure that the chosen Send Queue depth for a newly created transport does not overrun the QP WR limit of the underlying device. Lastly, there's no longer a need to carry the Send Queue depth in struct svcxprt_rdma, since the value is used only in the svc_rdma_accept() path. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-09-05 15:15:31 -04:00
Chuck Lever	5a25bfd28c	svcrdma: Limit RQ depth Ensure that the chosen Receive Queue depth for a newly created transport does not overrun the QP WR limit of the underlying device. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-09-05 15:15:30 -04:00
Chuck Lever	193bcb7b37	svcrdma: Populate tail iovec when receiving So that NFS WRITE payloads can eventually be placed directly into a file's page cache, enable the RPC-over-RDMA transport to present these payloads in the xdr_buf's page list, while placing trailing content (such as a GETATTR operation) in the xdr_buf's tail. After this change, the RPC-over-RDMA's "copy tail" hack, added by commit a97c331f9aa9 ("svcrdma: Handle additional inline content"), is no longer needed and can be removed. Signed-off-by: Chuck Lever <chuck.lever@oracle.com> Signed-off-by: J. Bruce Fields <bfields@redhat.com>	2017-09-05 15:15:29 -04:00
J. Bruce Fields	0828170f3d	merge nfsd 4.13 bugfixes into nfsd for-4.14 branch	2017-09-05 15:11:47 -04:00
Linus Torvalds	b42a362e6d	Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid Pull HID update from Jiri Kosina: - Wacom driver fixes/updates (device name generation improvements, touch ring status support) from Jason Gerecke - T100 touchpad support from Hans de Goede - support for batteries driven by HID input reports, from Dmitry Torokhov - Arnd pointed out that driver_lock semaphore is superfluous, as driver core already provides all the necessary concurency protection. Removal patch from Binoy Jayan - logical minimum numbering improvements in sensor-hub driver, from Srinivas Pandruvada - support for Microsoft Win8 Wireless Radio Controls extensions from João Paulo Rechi Vita - assorted small fixes and device ID additions * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid: (28 commits) HID: prodikeys: constify snd_rawmidi_ops structures HID: sensor: constify platform_device_id HID: input: throttle battery uevents HID: usbmouse: constify usb_device_id and fix space before '[' error HID: usbkbd: constify usb_device_id and fix space before '[' error. HID: hid-sensor-hub: Force logical minimum to 1 for power and report state HID: wacom: Do not completely map WACOM_HID_WD_TOUCHRINGSTATUS usage HID: asus: Add T100CHI bluetooth keyboard dock touchpad support HID: ntrig: constify attribute_group structures. HID: logitech-hidpp: constify attribute_group structures. HID: sensor: constify attribute_group structures. HID: multitouch: constify attribute_group structures. HID: multitouch: use proper symbolic constant for 0xff310076 application HID: multitouch: Support Asus T304UA media keys HID: multitouch: Support HID_GD_WIRELESS_RADIO_CTLS HID: input: optionally use device id in battery name HID: input: map digitizer battery usage HID: Remove the semaphore driver_lock HID: wacom: add USB_HID dependency HID: add ALWAYS_POLL quirk for Logitech 0xc077 ...	2017-09-05 11:54:41 -07:00
Florian Fainelli	0f15b09869	net: dsa: tag_brcm: Set output queue from skb queue mapping We originally used skb->priority but that was not quite correct as this bitfield needs to contain the egress switch queue we intend to send this SKB to. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 11:53:34 -07:00
Florian Fainelli	55199df6d2	net: dsa: Allow switch drivers to indicate number of TX queues Let switch drivers indicate how many TX queues they support. Some switches, such as Broadcom Starfighter 2 are designed with 8 egress queues. Future changes will allow us to leverage the queue mapping and direct the transmission towards a particular queue. Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 11:53:34 -07:00
Ido Schimmel	f1c2eddf4c	bridge: switchdev: Use an helper to clear forward mark Instead of using ifdef in the C file. Signed-off-by: Ido Schimmel <idosch@mellanox.com> Suggested-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Tested-by: Yotam Gigi <yotamg@mellanox.com> Acked-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 11:51:47 -07:00
Tom Herbert	1eed4dfb81	flow_dissector: Add limit for number of headers to dissect In flow dissector there are no limits to the number of nested encapsulations or headers that might be dissected which makes for a nice DOS attack. This patch sets a limit of the number of headers that flow dissector will parse. Headers includes network layer headers, transport layer headers, shim headers for encapsulation, IPv6 extension headers, etc. The limit for maximum number of headers to parse has be set to fifteen to account for a reasonable number of encapsulations, extension headers, VLAN, in a packet. Note that this limit does not supercede the STOP_AT_* flags which may stop processing before the headers limit is reached. Reported-by: Hannes Frederic Sowa <hannes@stressinduktion.org> Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 11:40:08 -07:00
Tom Herbert	3a1214e8b0	flow_dissector: Cleanup control flow __skb_flow_dissect is riddled with gotos that make discerning the flow, debugging, and extending the capability difficult. This patch reorganizes things so that we only perform goto's after the two main switch statements (no gotos within the cases now). It also eliminates several goto labels so that there are only two labels that can be target for goto. Reported-by: Alexander Popov <alex.popov@linux.com> Signed-off-by: Tom Herbert <tom@quantonium.net> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 11:40:08 -07:00
Arnd Bergmann	fd0c88b700	net/ncsi: fix ncsi_vlan_rx_{add,kill}_vid references We get a new link error in allmodconfig kernels after ftgmac100 started using the ncsi helpers: ERROR: "ncsi_vlan_rx_kill_vid" [drivers/net/ethernet/faraday/ftgmac100.ko] undefined! ERROR: "ncsi_vlan_rx_add_vid" [drivers/net/ethernet/faraday/ftgmac100.ko] undefined! Related to that, we get another error when CONFIG_NET_NCSI is disabled: drivers/net/ethernet/faraday/ftgmac100.c:1626:25: error: 'ncsi_vlan_rx_add_vid' undeclared here (not in a function); did you mean 'ncsi_start_dev'? drivers/net/ethernet/faraday/ftgmac100.c:1627:26: error: 'ncsi_vlan_rx_kill_vid' undeclared here (not in a function); did you mean 'ncsi_vlan_rx_add_vid'? This fixes both problems at once, using a 'static inline' stub helper for the disabled case, and exporting the functions when they are present. Fixes: 51564585d8c6 ("ftgmac100: Support NCSI VLAN filtering when available") Fixes: 21acf63013ed ("net/ncsi: Configure VLAN tag filter") Signed-off-by: Arnd Bergmann <arnd@arndb.de> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-05 09:11:45 -07:00
Igor Mitsyanko	ba83bfb1e8	nl80211: look for HT/VHT capabilities in beacon's tail There are no HT/VHT capabilities in cfg80211_ap_settings::beacon_ies, these should be looked for in beacon's tail instead. Fixes: 66cd794e3c30 ("nl80211: add HT/VHT capabilities to AP parameters") Signed-off-by: Igor Mitsyanko <igor.mitsyanko.os@quantenna.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-05 16:25:08 +02:00
Avraham Stern	6e46d8ce89	mac80211: flush hw_roc_start work before cancelling the ROC When HW ROC is supported it is possible that after the HW notified that the ROC has started, the ROC was cancelled and another ROC was added while the hw_roc_start worker is waiting on the mutex (since cancelling the ROC and adding another one also holds the same mutex). As a result, the hw_roc_start worker will continue to run after the new ROC is added but before it is actually started by the HW. This may result in notifying userspace that the ROC has started before it actually does, or in case of management tx ROC, in an attempt to tx while not on the right channel. In addition, when the driver will notify mac80211 that the second ROC has started, mac80211 will warn that this ROC has already been notified. Fix this by flushing the hw_roc_start work before cancelling an ROC. Cc: stable@vger.kernel.org Signed-off-by: Avraham Stern <avraham.stern@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-05 16:25:07 +02:00
Johannes Berg	979e1f0804	mac80211: agg-tx: call drv_wake_tx_queue in proper context Since drv_wake_tx_queue() is normally called in the TX path, which is already in an RCU critical section, we should call it the same way in the aggregation code path, so if the driver expects to be able to use RCU, it'll already be protected without having to enter a nested critical section. Additionally, disable soft-IRQs, since not doing so could cause issues in a driver that relies on them already being disabled like in the other path. Fixes: ba8c3d6f16a1 ("mac80211: add an intermediate software queue implementation") Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-05 16:25:07 +02:00
Chunho Lee	89e9bfc4ee	mac80211: Fix null pointer dereference with iTXQ support This change adds null pointer check before dereferencing pointer dev on netif_tx_start_all_queues() when an interface is added. With iTXQ support, netif_tx_start_all_queues() is always called while an interface is added. however, the netdev queues are not associated and dev is null when the interface is either NL80211_IFTYPE_P2P_DEVICE or NL80211_IFTYPE_NAN. Signed-off-by: Chunho Lee <ch.lee@newracom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-05 11:28:51 +02:00
Liad Kaufman	b44eebea18	mac80211: add MESH IE in the correct order VHT MESH support was added, but the order of the IEs wasn't enforced. Fix that. Signed-off-by: Liad Kaufman <liad.kaufman@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-05 11:28:51 +02:00
Sharon Dvir	d81b0fd0e7	mac80211: shorten debug prints using ht_dbg() to avoid warning Invoking ht_dbg() with too long of a string will print a warning. Shorten the messages while retaining the printed patameters. Signed-off-by: Sharon Dvir <sharon.dvir@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-05 11:28:50 +02:00
Johannes Berg	5316821590	mac80211: fix VLAN handling with TXQs With TXQs, the AP_VLAN interfaces are resolved to their owner AP interface when enqueuing the frame, which makes sense since the frame really goes out on that as far as the driver is concerned. However, this introduces a problem: frames to be encrypted with a VLAN-specific GTK will now be encrypted with the AP GTK, since the information about which virtual interface to use to select the key is taken from the TXQ. Fix this by preserving info->control.vif and using that in the dequeue function. This now requires doing the driver-mapping in the dequeue as well. Since there's no way to filter the frames that are sitting on a TXQ, drop all frames, which may affect other interfaces, when an AP_VLAN is removed. Cc: stable@vger.kernel.org Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-05 11:28:43 +02:00
Jiri Kosina	de6c5070ad	Merge branch 'for-4.14/wacom' into for-linus - name generation improvement for Wacom devices from Jason Gerecke - Kconfig dependency fix for Wacom driver from Arnd Bergmann	2017-09-05 11:14:10 +02:00
Simon Dinkin	7a7d3e4c03	mac80211: fix incorrect assignment of reassoc value this fix minor issue in the log message. in ieee80211_rx_mgmt_assoc_resp function, when assigning the reassoc value from the mgmt frame control: ieee80211_is_reassoc_resp function need to be used, instead of ieee80211_is_reassoc_req function. Signed-off-by: Simon Dinkin <simon.dinkin@tandemg.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-09-05 09:04:20 +02:00
Christoph Hellwig	670986ec01	net/9p: switch p9_fd_read to kernel_write Instead of playing with the addressing limits. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-09-04 19:05:16 -04:00
Christoph Hellwig	bdd1d2d3d2	fs: fix kernel_read prototype Use proper ssize_t and size_t types for the return value and count argument, move the offset last and make it an in/out argument like all other read/write helpers, and make the buf argument a void pointer to get rid of lots of casts in the callers. Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-09-04 19:05:15 -04:00
David S. Miller	2ff81cd35f	Merge git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next Pablo Neira Ayuso says: ==================== Netfilter updates for next-net (part 2) The following patchset contains Netfilter updates for net-next. This patchset includes updates for nf_tables, removal of CONFIG_NETFILTER_DEBUG and a new mode for xt_hashlimit. More specifically, they: 1) Add new rate match mode for hashlimit, this introduces a new revision for this match. The idea is to stop matching packets until ratelimit criteria stands true. Patch from Vishwanath Pai. 2) Add ->select_ops indirection to nf_tables named objects, so we can choose between different flavours of the same object type, patch from Pablo M. Bermudo. 3) Shorter function names in nft_limit, basically: s/nft_limit_pkt_bytes/nft_limit_bytes, also from Pablo M. Bermudo. 4) Add new stateful limit named object type, this allows us to create limit policies that you can identify via name, also from Pablo. 5) Remove unused hooknum parameter in conntrack ->packet indirection. From Florian Westphal. 6) Patches to remove CONFIG_NETFILTER_DEBUG and macros such as IP_NF_ASSERT and IP_NF_ASSERT. From Varsha Rao. 7) Add nf_tables_updchain() helper function and use it from nf_tables_newchain() to make it more maintainable. Similarly, add nf_tables_addchain() and use it too. 8) Add new netlink NLM_F_NONREC flag, this flag should only be used for deletion requests, specifically, to support non-recursive deletion. Based on what we discussed during NFWS'17 in Faro. 9) Use NLM_F_NONREC from table and sets in nf_tables. 10) Support for recursive chain deletion. Table and set deletion commands come with an implicit content flush on deletion, while chains do not. This patch addresses this inconsistency by adding the code to perform recursive chain deletions. This also comes with the bits to deal with the new NLM_F_NONREC netlink flag. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-09-04 15:27:49 -07:00
Linus Torvalds	5f82e71a00	Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull locking updates from Ingo Molnar: - Add 'cross-release' support to lockdep, which allows APIs like completions, where it's not the 'owner' who releases the lock, to be tracked. It's all activated automatically under CONFIG_PROVE_LOCKING=y. - Clean up (restructure) the x86 atomics op implementation to be more readable, in preparation of KASAN annotations. (Dmitry Vyukov) - Fix static keys (Paolo Bonzini) - Add killable versions of down_read() et al (Kirill Tkhai) - Rework and fix jump_label locking (Marc Zyngier, Paolo Bonzini) - Rework (and fix) tlb_flush_pending() barriers (Peter Zijlstra) - Remove smp_mb__before_spinlock() and convert its usages, introduce smp_mb__after_spinlock() (Peter Zijlstra) * 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (56 commits) locking/lockdep/selftests: Fix mixed read-write ABBA tests sched/completion: Avoid unnecessary stack allocation for COMPLETION_INITIALIZER_ONSTACK() acpi/nfit: Fix COMPLETION_INITIALIZER_ONSTACK() abuse locking/pvqspinlock: Relax cmpxchg's to improve performance on some architectures smp: Avoid using two cache lines for struct call_single_data locking/lockdep: Untangle xhlock history save/restore from task independence locking/refcounts, x86/asm: Disable CONFIG_ARCH_HAS_REFCOUNT for the time being futex: Remove duplicated code and fix undefined behaviour Documentation/locking/atomic: Finish the document... locking/lockdep: Fix workqueue crossrelease annotation workqueue/lockdep: 'Fix' flush_work() annotation locking/lockdep/selftests: Add mixed read-write ABBA tests mm, locking/barriers: Clarify tlb_flush_pending() barriers locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE and CONFIG_LOCKDEP_COMPLETIONS truly non-interactive locking/lockdep: Explicitly initialize wq_barrier::done::map locking/lockdep: Rename CONFIG_LOCKDEP_COMPLETE to CONFIG_LOCKDEP_COMPLETIONS locking/lockdep: Reword title of LOCKDEP_CROSSRELEASE config locking/lockdep: Make CONFIG_LOCKDEP_CROSSRELEASE part of CONFIG_PROVE_LOCKING locking/refcounts, x86/asm: Implement fast refcount overflow protection locking/lockdep: Fix the rollback and overwrite detection logic in crossrelease ...	2017-09-04 11:52:29 -07:00
Pablo Neira Ayuso	9dee147412	netfilter: nf_tables: support for recursive chain deletion This patch sorts out an asymmetry in deletions. Currently, table and set deletion commands come with an implicit content flush on deletion. However, chain deletion results in -EBUSY if there is content in this chain, so no implicit flush happens. So you have to send a flush command in first place to delete chains, this is inconsistent and it can be annoying in terms of user experience. This patch uses the new NLM_F_NONREC flag to request non-recursive chain deletion, ie. if the chain to be removed contains rules, then this returns EBUSY. This problem was discussed during the NFWS'17 in Faro, Portugal. In iptables, you hit -EBUSY if you try to delete a chain that contains rules, so you have to flush first before you can remove anything. Since iptables-compat uses the nf_tables netlink interface, it has to use the NLM_F_NONREC flag from userspace to retain the original iptables semantics, ie. bail out on removing chains that contain rules. Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>	2017-09-04 17:34:55 +02:00

1 2 3 4 5 ...

47906 Commits