linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-13 08:39:52 +00:00

Author	SHA1	Message	Date
Sudeep Holla	cc6eeaa302	mailbox: skip complete wait event if timer expired If a wait_for_completion_timeout() call returns due to a timeout, complete() can get called after returning from the wait which is incorrect and can cause subsequent transmissions on a channel to fail. Since the wait_for_completion_timeout() sees the completion variable is non-zero caused by the erroneous/spurious complete() call, and it immediately returns without waiting for the time as expected by the client. This patch fixes the issue by skipping complete() call for the timer expiry. Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox") Reported-by: Alexey Klimov <alexey.klimov@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>	2017-04-27 16:20:04 +05:30
Sudeep Holla	c61b781ee0	mailbox: always wait in mbox_send_message for blocking Tx mode There exists a race when msg_submit return immediately as there was an active request being processed which may have completed just before it's checked again in mbox_send_message. This will result in return to the caller without waiting in mbox_send_message even when it's blocking Tx. This patch fixes the issue by waiting for the completion always if Tx is in blocking mode. Fixes: 2b6d83e2b8b7 ("mailbox: Introduce framework for mailbox") Reported-by: Alexey Klimov <alexey.klimov@arm.com> Signed-off-by: Sudeep Holla <sudeep.holla@arm.com> Reviewed-by: Alexey Klimov <alexey.klimov@arm.com> Signed-off-by: Jassi Brar <jaswinder.singh@linaro.org>	2017-04-27 16:20:04 +05:30
Sabrina Dubroca	cfcf99f987	xfrm: fix GRO for !CONFIG_NETFILTER In xfrm_input() when called from GRO, async == 0, and we end up skipping the processing in xfrm4_transport_finish(). GRO path will always skip the NF_HOOK, so we don't need the special-case for !NETFILTER during GRO processing. Fixes: 7785bba299a8 ("esp: Add a software GRO codepath") Signed-off-by: Sabrina Dubroca <sd@queasysnail.net> Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>	2017-04-27 12:20:19 +02:00
Oliver Hartkopp	c2701b370e	can: fix CAN BCM build with CONFIG_PROC_FS disabled The introduced namespace support moved the BCM variables for procfs into a per-net data structure. This leads to a build failure with disabled procfs: on x86_64: when CONFIG_PROC_FS is not enabled: ../net/can/bcm.c:1541:14: error: 'struct netns_can' has no member named 'bcmproc_dir' ../net/can/bcm.c:1601:14: error: 'struct netns_can' has no member named 'bcmproc_dir' ../net/can/bcm.c:1696:11: error: 'struct netns_can' has no member named 'bcmproc_dir' ../net/can/bcm.c:1707:15: error: 'struct netns_can' has no member named 'bcmproc_dir' http://marc.info/?l=linux-can&m=149321842526524&w=2 Reported-by: Randy Dunlap <rdunlap@infradead.org> Signed-off-by: Oliver Hartkopp <socketcan@hartkopp.net> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>	2017-04-27 09:34:13 +02:00
Frederic Weisbecker	25e2d8c1b9	sched/cputime: Fix ksoftirqd cputime accounting regression irq_time_read() returns the irqtime minus the ksoftirqd time. This is necessary because irq_time_read() is used to substract the IRQ time from the sum_exec_runtime of a task. If we were to include the softirq time of ksoftirqd, this task would substract its own CPU time everytime it updates ksoftirqd->sum_exec_runtime which would therefore never progress. But this behaviour got broken by: a499a5a14db ("sched/cputime: Increment kcpustat directly on irqtime account") ... which now includes ksoftirqd softirq time in the time returned by irq_time_read(). This has resulted in wrong ksoftirqd cputime reported to userspace through /proc/stat and thus "top" not showing ksoftirqd when it should after intense networking load. ksoftirqd->stime happens to be correct but it gets scaled down by sum_exec_runtime through task_cputime_adjusted(). To fix this, just account the strict IRQ time in a separate counter and use it to report the IRQ time. Reported-and-tested-by: Jesper Dangaard Brouer <brouer@redhat.com> Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com> Reviewed-by: Rik van Riel <riel@redhat.com> Acked-by: Jesper Dangaard Brouer <brouer@redhat.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stanislaw Gruszka <sgruszka@redhat.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Wanpeng Li <wanpeng.li@hotmail.com> Link: http://lkml.kernel.org/r/1493129448-5356-1-git-send-email-fweisbec@gmail.com Signed-off-by: Ingo Molnar <mingo@kernel.org>	2017-04-27 09:08:26 +02:00
Christoph Hellwig	7569b90a22	nvme-scsi: remove nvme_trans_security_protocol This function just returns the same error code and sense data as the default statement in the switch in the caller. Signed-off-by: Christoph Hellwig <hch@lst.de> Reviewed-by: Keith Busch <keith.busch@intel.com>	2017-04-27 08:39:32 +02:00
Martin Schwidefsky	d0790fb6e5	Merge branch 's390forkvm' of git://git.kernel.org/pub/scm/linux/kernel/git/kvms390/linux into features Pull cpacf changes for KVM from Jason Herne: Add query support for the KMA instruction.	2017-04-27 07:34:07 +02:00
David S. Miller	b1513c3531	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 22:39:08 -04:00
Dmitry V. Levin	1741937d47	uapi: change the type of struct statx_timestamp.tv_nsec to unsigned The comment asserting that the value of struct statx_timestamp.tv_nsec must be negative when statx_timestamp.tv_sec is negative, is wrong, as could be seen from the following example: #define _FILE_OFFSET_BITS 64 #include <assert.h> #include <fcntl.h> #include <stdio.h> #include <sys/stat.h> #include <unistd.h> #include <asm/unistd.h> #include <linux/stat.h> int main(void) { static const struct timespec ts[2] = { { .tv_nsec = UTIME_OMIT }, { .tv_sec = -2, .tv_nsec = 42 } }; assert(utimensat(AT_FDCWD, ".", ts, 0) == 0); struct stat st; assert(stat(".", &st) == 0); printf("st_mtim.tv_sec = %lld, st_mtim.tv_nsec = %lu\n", (long long) st.st_mtim.tv_sec, (unsigned long) st.st_mtim.tv_nsec); struct statx stx; assert(syscall(__NR_statx, AT_FDCWD, ".", 0, 0, &stx) == 0); printf("stx_mtime.tv_sec = %lld, stx_mtime.tv_nsec = %lu\n", (long long) stx.stx_mtime.tv_sec, (unsigned long) stx.stx_mtime.tv_nsec); return 0; } It expectedly prints: st_mtim.tv_sec = -2, st_mtim.tv_nsec = 42 stx_mtime.tv_sec = -2, stx_mtime.tv_nsec = 42 The more generic comment asserting that the value of struct statx_timestamp.tv_nsec might be negative is confusing to say the least. It contradicts both the struct stat.st_[acm]time_nsec tradition and struct timespec.tv_nsec requirements in utimensat syscall. If statx syscall ever returns a stx_[acm]time containing a negative tv_nsec that cannot be passed unmodified to utimensat syscall, it will cause an immense confusion. Fix this source of confusion by changing the type of struct statx_timestamp.tv_nsec from __s32 to __u32. Fixes: a528d35e8bfc ("statx: Add a system call to make enhanced file info available") Signed-off-by: Dmitry V. Levin <ldv@altlinux.org> Signed-off-by: David Howells <dhowells@redhat.com> cc: linux-api@vger.kernel.org cc: mtk.manpages@gmail.com Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>	2017-04-26 21:19:05 -04:00
Linus Torvalds	f83246089c	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc Pull sparc fixes from David Miller: "I didn't want the release to go out without the statx system call properly hooked up" * git://git.kernel.org/pub/scm/linux/kernel/git/davem/sparc: sparc: Update syscall tables. sparc64: Fill in rest of HAVE_REGS_AND_STACK_ACCESS_API	2017-04-26 15:10:45 -07:00
David Howells	1e2f82d1e9	statx: Kill fd-with-NULL-path support in favour of AT_EMPTY_PATH With the new statx() syscall, the following both allow the attributes of the file attached to a file descriptor to be retrieved: statx(dfd, NULL, 0, ...); and: statx(dfd, "", AT_EMPTY_PATH, ...); Change the code to reject the first option, though this means copying the path and engaging pathwalk for the fstat() equivalent. dfd can be a non-directory provided path is "". [ The timing of this isn't wonderful, but applying this now before we have statx() in any released kernel, before anybody starts using the NULL special case. - Linus ] Fixes: a528d35e8bfc ("statx: Add a system call to make enhanced file info available") Reported-by: Michael Kerrisk <mtk.manpages@gmail.com> Signed-off-by: David Howells <dhowells@redhat.com> cc: Eric Sandeen <sandeen@sandeen.net> cc: fstests@vger.kernel.org cc: linux-api@vger.kernel.org cc: linux-man@vger.kernel.org Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>	2017-04-26 15:05:47 -07:00
Luca Coelho	e5f2e0671e	mac80211: make multicast variable a bool in ieee80211_accept_frame() The multicast variable in the ieee80211_accept_frame() function is treated as a boolean, but defined as int. Fix that. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:44 +02:00
Johannes Berg	4a19906823	mac80211: disentangle iflist_mtx and chanctx_mtx At least on iwlwifi, sometimes lockdep complains that we can lock chanctx_mtx -> mvm.mutex -> iflist_mtx (due to iterate_interfaces) and iflist_mtx -> chanctx_mtx Remove the latter dependency in mac80211 by using the RTNL that we already hold in one case, and can relatively easily achieve in the other case. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:44 +02:00
Emmanuel Grumbach	cf147085fd	mac80211: don't parse encrypted management frames in ieee80211_frame_acked ieee80211_frame_acked is called when a frame is acked by the peer. In case this is a management frame, we check if this an SMPS frame, in which case we can update our antenna configuration. When we parse the management frame we look at the category in case it is an action frame. That byte sits after the IV in case the frame was encrypted. This means that if the frame was encrypted, we basically look at the IV instead of looking at the category. It is then theorically possible that we think that an SMPS action frame was acked where really we had another frame that was encrypted. Since the only management frame whose ack needs to be tracked is the SMPS action frame, and that frame is not a robust management frame, it will never be encrypted. The easiest way to fix this problem is then to not look at frames that were encrypted. Signed-off-by: Emmanuel Grumbach <emmanuel.grumbach@intel.com> Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:43 +02:00
Johannes Berg	f6601e176c	ieee80211: fix kernel-doc parsing errors Some of the enum definitions are unnamed but there's still an attempt at documenting them - that doesn't work. Name them to make that work. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:42 +02:00
Luca Coelho	2ead3235fd	ieee80211: add FT-802.1X AKM suite selector Add the definition for FT-8021.1X AKM selector as defined in IEEE Std 802.11-2016, table 9-133. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:42 +02:00
Luca Coelho	1cbf41dbac	ieee80211: add SUITE_B AKM selectors Add the definitions for SUITE_B and SUITE_B_192 AKM selectors as defined in IEEE802.11REVmc_D5.0, table 9-132. Signed-off-by: Luca Coelho <luciano.coelho@intel.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:41 +02:00
Arend Van Spriel	3a3ecf1d59	cfg80211: add request id parameter to .sched_scan_stop() signature For multiple scheduled scan support the driver needs to know which scheduled scan request is being stopped. Pass the request id in the .sched_scan_stop() callback. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:40 +02:00
Arend Van Spriel	3007e3529c	nl80211: add support for BSSIDs in scheduled scan matchsets This patch allows for the scheduled scan request to specify matchsets for specific BSSIDs. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> [docs, netlink policy fix] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:39 +02:00
Arend Van Spriel	ca986ad9bc	nl80211: allow multiple active scheduled scan requests This patch implements the idea to have multiple scheduled scan requests running concurrently. It mainly illustrates how to deal with the incoming request from user-space in terms of backward compatibility. In order to use multiple scheduled scans user-space needs to provide a flag attribute NL80211_ATTR_SCHED_SCAN_MULTI to indicate support. If not the request is treated as a legacy scan. Drivers currently supporting scheduled scan are now indicating they support a single scheduled scan request. This obsoletes WIPHY_FLAG_SUPPORTS_SCHED_SCAN. Reviewed-by: Hante Meuleman <hante.meuleman@broadcom.com> Reviewed-by: Pieter-Paul Giesberts <pieter-paul.giesberts@broadcom.com> Reviewed-by: Franky Lin <franky.lin@broadcom.com> Signed-off-by: Arend van Spriel <arend.vanspriel@broadcom.com> [clean up netlink destroy path to avoid allocations, code cleanups] Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:38 +02:00
Johannes Berg	ab81007a7b	cfg80211: simplify netlink socket owner interface deletion There's no need to allocate a portid structure and then, for each of those, walk the interfaces - we can just add a flag to each interface and walk those directly. Due to padding in the struct, we can even do it without any memory cost, and it even simplifies the code. Signed-off-by: Johannes Berg <johannes.berg@intel.com>	2017-04-26 23:17:35 +02:00
Bart Van Assche	0eebd005dd	scsi: Implement blk_mq_ops.show_rq() Show the SCSI CDB for pending SCSI commands in /sys/kernel/debug/block//mq//dispatch and */rq_list. An example of how SCSI commands are displayed by this code: ffff8801703245c0 {.op=READ, .cmd_flags=META PRIO, .rq_flags=DONTPREP IO_STAT STATS, .tag=14, .internal_tag=-1, .cmd=Read(10) 28 00 2a 81 1b 30 00 00 08 00} Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Cc: Martin K. Petersen <martin.petersen@oracle.com> Cc: James Bottomley <James.Bottomley@HansenPartnership.com> Cc: Hannes Reinecke <hare@suse.com> Cc: <linux-scsi@vger.kernel.org> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	2836ee4b1a	blk-mq: Add blk_mq_ops.show_rq() This new callback function will be used in the next patch to show more information about SCSI requests. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	8658dca8bd	blk-mq: Show operation, cmd_flags and rq_flags names Show the operation name, .cmd_flags and .rq_flags as names instead of numbers. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	fd07dc8185	blk-mq: Make blk_flags_show() callers append a newline character This patch does not change any functionality but makes it possible to produce a single line of output with multiple flag-to-name translations. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	65ca1ca32c	blk-mq: Move the "state" debugfs attribute one level down Move the "state" attribute from the top level to the "mq" directory as requested by Omar. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	e869b5462f	blk-mq: Unregister debugfs attributes earlier We currently call blk_mq_free_queue() from blk_cleanup_queue() before we unregister the debugfs attributes for that queue in blk_release_queue(). This leaves a window open during which accessing most of the mq debugfs attributes would cause a use-after-free. Additionally, the "state" attribute allows running the queue, which we should not do after the queue has entered the "dead" state. Fix both cases by unregistering the debugfs attributes before freeing queue resources starts. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	f05d1ba787	blk-mq: Only unregister hctxs for which registration succeeded Hctx unregistration involves calling kobject_del(). kobject_del() must not be called if kobject_add() has not been called. Hence in the error path only unregister hctxs for which registration succeeded. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: Omar Sandoval <osandov@fb.com> Cc: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	62d6c9496a	blk-mq-debugfs: Rename functions for registering and unregistering the mq directory Since the blk_mq_debugfs_register_hctxs() functions register and unregister all attributes under the "mq" directory, rename these into blk_mq_debugfs_register_mq(). Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	4c9e4019f1	blk-mq: Let blk_mq_debugfs_register() look up the queue name A later patch will move the call of blk_mq_debugfs_register() to a function to which the queue name is not passed as an argument. To avoid having to add a 'name' argument to multiple callers, let blk_mq_debugfs_register() look up the queue name. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Bart Van Assche	2d0364c8c1	blk-mq: Register <dev>/queue/mq after having registered <dev>/queue A later patch in this series will modify blk_mq_debugfs_register() such that it uses q->kobj.parent to determine the name of a request queue. Hence make sure that that pointer is initialized before blk_mq_debugfs_register() is called. To avoid lock inversion, protect sysfs / debugfs registration with the queue sysfs_lock instead of the global mutex all_q_mutex. Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Reviewed-by: Hannes Reinecke <hare@suse.com> Reviewed-by: Omar Sandoval <osandov@fb.com> Signed-off-by: Jens Axboe <axboe@fb.com>	2017-04-26 15:09:04 -06:00
Linus Torvalds	fc08b197bb	Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net Pull networking fixes from David Miller: 1) MLX5 bug fixes from Saeed Mahameed et al: - released wrong resources when firmware timeout happens - fix wrong check for encapsulation size limits - UAR memory leak - ETHTOOL_GRXCLSRLALL failed to fill in info->data 2) Don't cache l3mdev on mis-matches local route, causes net devices to leak refs. From Robert Shearman. 3) Handle fragmented SKBs properly in macsec driver, the problem is that we were mis-sizing the sgvec table. From Jason A. Donenfeld. 4) We cannot have checksum offload enabled for inner UDP tunneled packet during IPSEC, from Ansis Atteka. 5) Fix double SKB free in ravb driver, from Dan Carpenter. 6) Fix CPU port handling in b53 DSA driver, from Florian Dainelli. 7) Don't use on-stack buffers for usb_control_msg() in CAN usb driver, from Maksim Salau. 8) Fix device leak in macvlan driver, from Herbert Xu. We have to purge the broadcast queue properly on port destroy. 9) Fix tx ring entry limit on EF10 devices in sfc driver. From Bert Kenward. 10) Fix memory leaks in team driver, from Pan Bian. 11) Don't setup ipv6_stub before it can be actually used, from Paolo Abeni. 12) Fix tipc socket flow control accounting, from Parthasarathy Bhuvaragan. 13) Fix crash on module unload in hso driver, from Andreas Kemnade. 14) Fix purging of bridge multicast entries, the problem is that if we don't defer it to ndo_uninit it's possible for new entries to get added after we purge. Fix from Xin Long. 15) Don't return garbage for PACKET_HDRLEN getsockopt, from Alexander Potapenko. 16) Fix autoneg stall properly in PHY layer, and revert micrel driver change that was papering over it. From Alexander Kochetkov. 17) Don't dereference an ipv4 route as an ipv6 one in the ip6_tunnnel code, from Cong Wang. 18) Clear out the congestion control private of the TCP socket in all of the right places, from Wei Wang. 19) rawv6_ioctl measures SKB length incorrectly, fix from Jamie Bainbridge. * git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (41 commits) ipv6: check raw payload size correctly in ioctl tcp: memset ca_priv data to 0 properly ipv6: check skb->protocol before lookup for nexthop net: core: Prevent from dereferencing null pointer when releasing SKB macsec: dynamically allocate space for sglist Revert "phy: micrel: Disable auto negotiation on startup" net: phy: fix auto-negotiation stall due to unavailable interrupt net/packet: check length in getsockopt() called with PACKET_HDRLEN net: ipv6: regenerate host route if moved to gc list bridge: move bridge multicast cleanup to ndo_uninit ipv6: fix source routing qed: Fix error in the dcbx app meta data initialization. netvsc: fix calculation of available send sections net: hso: fix module unloading tipc: fix socket flow control accounting error at tipc_recv_stream tipc: fix socket flow control accounting error at tipc_send_stream ipv6: move stub initialization after ipv6 setup completion team: fix memory leaks sfc: tx ring can only have 2048 entries for all EF10 NICs macvlan: Fix device ref leak when purging bc_queue ...	2017-04-26 13:42:32 -07:00
Jamie Bainbridge	105f5528b9	ipv6: check raw payload size correctly in ioctl In situations where an skb is paged, the transport header pointer and tail pointer can be the same because the skb contents are in frags. This results in ioctl(SIOCINQ/FIONREAD) incorrectly returning a length of 0 when the length to receive is actually greater than zero. skb->len is already correctly set in ip6_input_finish() with pskb_pull(), so use skb->len as it always returns the correct result for both linear and paged data. Signed-off-by: Jamie Bainbridge <jbainbri@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:59:35 -04:00
Wei Wang	c120144407	tcp: memset ca_priv data to 0 properly Always zero out ca_priv data in tcp_assign_congestion_control() so that ca_priv data is cleared out during socket creation. Also always zero out ca_priv data in tcp_reinit_congestion_control() so that when cc algorithm is changed, ca_priv data is cleared out as well. We should still zero out ca_priv data even in TCP_CLOSE state because user could call connect() on AF_UNSPEC to disconnect the socket and leave it in TCP_CLOSE state and later call setsockopt() to switch cc algorithm on this socket. Fixes: 2b0a8c9ee ("tcp: add CDG congestion control") Reported-by: Andrey Konovalov <andreyknvl@google.com> Signed-off-by: Wei Wang <weiwan@google.com> Acked-by: Eric Dumazet <edumazet@google.com> Acked-by: Yuchung Cheng <ycheng@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:58:32 -04:00
WANG Cong	199ab00f3c	ipv6: check skb->protocol before lookup for nexthop Andrey reported a out-of-bound access in ip6_tnl_xmit(), this is because we use an ipv4 dst in ip6_tnl_xmit() and cast an IPv4 neigh key as an IPv6 address: neigh = dst_neigh_lookup(skb_dst(skb), &ipv6_hdr(skb)->daddr); if (!neigh) goto tx_err_link_failure; addr6 = (struct in6_addr *)&neigh->primary_key; // <=== HERE addr_type = ipv6_addr_type(addr6); if (addr_type == IPV6_ADDR_ANY) addr6 = &ipv6_hdr(skb)->daddr; memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr)); Also the network header of the skb at this point should be still IPv4 for 4in6 tunnels, we shold not just use it as IPv6 header. This patch fixes it by checking if skb->protocol is ETH_P_IPV6: if it is, we are safe to do the nexthop lookup using skb_dst() and ipv6_hdr(skb)->daddr; if not (aka IPv4), we have no clue about which dest address we can pick here, we have to rely on callers to fill it from tunnel config, so just fall to ip6_route_output() to make the decision. Fixes: ea3dc9601bda ("ip6_tunnel: Add support for wildcard tunnel endpoints.") Reported-by: Andrey Konovalov <andreyknvl@google.com> Tested-by: Andrey Konovalov <andreyknvl@google.com> Cc: Steffen Klassert <steffen.klassert@secunet.com> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:51:26 -04:00
Willem de Bruijn	78a57b482a	virtio-net: on tx, only call napi_disable if tx napi is on As of tx napi, device down (`ip link set dev $dev down`) hangs unless tx napi is enabled. Else napi_enable is not called, so napi_disable will spin on test_and_set_bit NAPI_STATE_SCHED. Only call napi_disable if tx napi is enabled. Fixes: 5a719c2552ca ("virtio-net: transmit napi") Reported-by: Jason Wang <jasowang@redhat.com> Signed-off-by: Willem de Bruijn <willemb@google.com> Acked-by: Michael S. Tsirkin <mst@redhat.com> Acked-by: Jason Wang <jasowang@redhat.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:50:06 -04:00
David S. Miller	1f7fe5d492	Merge branch 'ibmvnic-Move-sub-crq-init-out-of-interrupt-context' Nathan Fontenot says: ==================== ibmvnic: Move sub crq init out of interrupt context The sub crqs are currently intialized in interrupt context when handling a crq response fromn the vios server. There is no reason they must be initialized there. Moving the initialization of the sub crqs to the ibmvnic_init routine allows us to do the initialization outside of interrupt context and make all of the allocations with GFP_KERNEL instead of GFP_ATOMIC. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:49:16 -04:00
Nathan Fontenot	1bb3c739ad	ibmvnic: Move initialization of sub crqs to ibmvnic_init The sub crq structures are initialized in interrupt context while handling the response to crqs when negotiating capabilities for the driver. The sub crqs do not need to be initialized at this point and can be moved to being done from ibmvnic_init. Moving the init of the sub crqs to ibmvnic_init also allows use to allocate the memory with GFP_KERNEL instead of GFP_ATOMIC. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:49:15 -04:00
Nathan Fontenot	d346b9bc4f	ibmvnic: Split initialization of scrqs to its own routine Split the sending of capability request crqs and the initialization of sub crqs into their own routines. This is a first step to moving the allocation of sub-crqs out of interrupt context. Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:49:15 -04:00
Myungho Jung	9899886d5e	net: core: Prevent from dereferencing null pointer when releasing SKB Added NULL check to make __dev_kfree_skb_irq consistent with kfree family of functions. Link: https://bugzilla.kernel.org/show_bug.cgi?id=195289 Signed-off-by: Myungho Jung <mhjungk@gmail.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:47:14 -04:00
Florian Fainelli	4c5e7a2c05	dt-bindings: mdio: Clarify binding document The described GPIO reset property is applicable to all child PHYs. If we have one reset line per PHY present on the MDIO bus, these automatically become properties of the child PHY nodes. Finally, indicate how the RESET pulse width must be defined, which is the maximum value of all individual PHYs RESET pulse widths determined by reading their datasheets. Fixes: 69226896ad63 ("mdio_bus: Issue GPIO RESET to PHYs.") Signed-off-by: Florian Fainelli <f.fainelli@gmail.com> Reviewed-by: Roger Quadros <rogerq@ti.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:45:34 -04:00
David S. Miller	4629498336	Merge branch 'tcp-do-not-use-tcp_time_stamp-for-rcv-autotuning' Eric Dumazet says: ==================== tcp: do not use tcp_time_stamp for rcv autotuning Some devices or linux distributions use HZ=100 or HZ=250 TCP receive buffer autotuning has poor behavior caused by this choice. Since autotuning happens after 4 ms or 10 ms, short distance flows get their receive buffer tuned to a very high value, but after an initial period where it was frozen to (too small) initial value. With BBR (or other CC allowing to increase BDP), we are willing to increase tcp_rmem[2], but this receive autotuning defect is a blocker for hosts dealing with gazillions of TCP flows in the data centers, since many of them have inflated RCVBUF. Risk of OOM is too high. Note that TSO autodefer, tcp cubic, and TCP TS options (RFC 7323) also suffer from our dependency to jiffies (via tcp_time_stamp). We have ongoing efforts to improve all that in the future. ==================== Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:39 -04:00
Eric Dumazet	645f4c6f2e	tcp: switch rcv_rtt_est and rcvq_space to high resolution timestamps Some devices or distributions use HZ=100 or HZ=250 TCP receive buffer autotuning has poor behavior caused by this choice. Since autotuning happens after 4 ms or 10 ms, short distance flows get their receive buffer tuned to a very high value, but after an initial period where it was frozen to (too small) initial value. With tp->tcp_mstamp introduction, we can switch to high resolution timestamps almost for free (at the expense of 8 additional bytes per TCP structure) Note that some TCP stacks use usec TCP timestamps where this patch makes even more sense : Many TCP flows have < 500 usec RTT. Hopefully this finer TS option can be standardized soon. Tested: HZ=100 kernel ./netperf -H lpaa24 -t TCP_RR -l 1000 -- -r 10000,10000 & Peer without patch : lpaa24:~# ss -tmi dst lpaa23 ... skmem:(r0,rb8388608,...) rcv_rtt:10 rcv_space:3210000 minrtt:0.017 Peer with the patch : lpaa23:~# ss -tmi dst lpaa24 ... skmem:(r0,rb428800,...) rcv_rtt:0.069 rcv_space:30000 minrtt:0.017 We can see saner RCVBUF, and more precise rcv_rtt information. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:39 -04:00
Eric Dumazet	a6db50b81e	tcp: remove ack_time from struct tcp_sacktag_state It is no longer needed, everything uses tp->tcp_mstamp instead. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:38 -04:00
Eric Dumazet	7e0ca8a4c1	tcp: use tp->tcp_mstamp in tcp_clean_rtx_queue() Following patch will remove ack_time from struct tcp_sacktag_state Same info is now found in tp->tcp_mstamp Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:38 -04:00
Eric Dumazet	d2329f102d	tcp: do not pass timestamp to tcp_rack_advance() No longer needed, since tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:38 -04:00
Eric Dumazet	88d5c65098	tcp: do not pass timestamp to tcp_rate_gen() No longer needed, since tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:38 -04:00
Eric Dumazet	1317a9d69f	tcp: do not pass timestamp to tcp_fastretrans_alert() Not used anymore now tp->tcp_mstamp holds the information. This is needed to remove sack_state.ack_time in a following patch. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:38 -04:00
Eric Dumazet	efab8f8582	tcp: do not pass timestamp to tcp_rack_identify_loss() Not used anymore now tp->tcp_mstamp holds the information. Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:37 -04:00
Eric Dumazet	128eda86be	tcp: do not pass timestamp to tcp_rack_mark_lost() This is no longer used, since tcp_rack_detect_loss() takes the timestamp from tp->tcp_mstamp Signed-off-by: Eric Dumazet <edumazet@google.com> Acked-by: Soheil Hassas Yeganeh <soheil@google.com> Acked-by: Neal Cardwell <ncardwell@google.com> Signed-off-by: David S. Miller <davem@davemloft.net>	2017-04-26 14:44:37 -04:00

... 6 7 8 9 10 ...

668080 Commits