Drop the global lru lock in isolate callback before calling
zap_page_range which calls cond_resched, and re-acquire the global lru
lock before returning. Also change return code to LRU_REMOVED_RETRY.
Use mmput_async when fail to acquire mmap sem in an atomic context.
Fix "BUG: sleeping function called from invalid context"
errors when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
Also restore mmput_async, which was initially introduced in commit
ec8d7c14ea ("mm, oom_reaper: do not mmput synchronously from the oom
reaper context"), and was removed in commit 2129258024 ("mm: oom: let
oom_reap_task and exit_mmap run concurrently").
Link: http://lkml.kernel.org/r/20170914182231.90908-1-sherryy@android.com
Fixes: f2517eb76f ("android: binder: Add global lru shrinker to binder")
Signed-off-by: Sherry Yang <sherryy@android.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Reported-by: Kyle Yan <kyan@codeaurora.org>
Acked-by: Arve Hjønnevåg <arve@android.com>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Martijn Coenen <maco@google.com>
Cc: Todd Kjos <tkjos@google.com>
Cc: Riley Andrews <riandrews@android.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Hillf Danton <hdanton@sina.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Hoeun Ryu <hoeun.ryu@gmail.com>
Cc: Christopher Lameter <cl@linux.com>
Cc: Vegard Nossum <vegard.nossum@oracle.com>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Andrea has noticed that the oom_reaper doesn't invalidate the range via
mmu notifiers (mmu_notifier_invalidate_range_start/end) and that can
corrupt the memory of the kvm guest for example.
tlb_flush_mmu_tlbonly already invokes mmu notifiers but that is not
sufficient as per Andrea:
"mmu_notifier_invalidate_range cannot be used in replacement of
mmu_notifier_invalidate_range_start/end. For KVM
mmu_notifier_invalidate_range is a noop and rightfully so. A MMU
notifier implementation has to implement either ->invalidate_range
method or the invalidate_range_start/end methods, not both. And if you
implement invalidate_range_start/end like KVM is forced to do, calling
mmu_notifier_invalidate_range in common code is a noop for KVM.
For those MMU notifiers that can get away only implementing
->invalidate_range, the ->invalidate_range is implicitly called by
mmu_notifier_invalidate_range_end(). And only those secondary MMUs
that share the same pagetable with the primary MMU (like AMD iommuv2)
can get away only implementing ->invalidate_range"
As the callback is allowed to sleep and the implementation is out of
hand of the MM it is safer to simply bail out if there is an mmu
notifier registered. In order to not fail too early make the
mm_has_notifiers check under the oom_lock and have a little nap before
failing to give the current oom victim some more time to exit.
[akpm@linux-foundation.org: coding-style fixes]
Link: http://lkml.kernel.org/r/20170913113427.2291-1-mhocko@kernel.org
Fixes: aac4536355 ("mm, oom: introduce oom reaper")
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reported-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
There's a typo in recent change of VM_MPX definition. We want it to be
VM_HIGH_ARCH_4, not VM_HIGH_ARCH_BIT_4.
This bug does cause visible regressions. In arch_vma_name the vmflags
are tested against VM_MPX. With the incorrect value of VM_MPX, a number
of vmas (such as the stack) test positive and end up being marked as
"[mpx]" in /proc/N/maps instead of their correct names.
This confuses tools like rr which expect to be able to find familiar
vmas.
Fixes: df3735c5b4 ("x86,mpx: make mpx depend on x86-64 to free up VMA flag")
Link: http://lkml.kernel.org/r/20170918140253.36856-1-kirill.shutemov@linux.intel.com
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: Rik van Riel <riel@redhat.com>
Cc: Dave Hansen <dave.hansen@intel.com>
Cc: Kyle Huey <me@kylehuey.com>
Cc: <stable@vger.kernel.org> [4.14+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Pull percpu fixes from Tejun Heo:
"Rather important fixes this time.
- The new percpu area allocator had a subtle bug in how it iterates
the memory regions and could skip viable areas, which led to
allocation failures for module static percpu variables. Dennis
fixed the bug and another non-critical one in stat calculation.
- Mark noticed that the generic implementations of percpu local
atomic reads aren't properly protected against irqs and there's a
(slim) chance for split reads on some 32bit systems. Generic
implementations are updated to disable irq when read size is larger
than ulong size. This may have made some 32bit archs which can do
atomic local 64bit accesses generate sub-optimal code. We need to
find them out and implement arch-specific overrides"
* 'for-4.14-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu:
percpu: fix iteration to prevent skipping over block
percpu: fix starting offset for chunk statistics traversal
percpu: make this_cpu_generic_read() atomic w.r.t. interrupts
Here are a number of USB fixes for 4.14-rc4 to resolved reported issue.
There's a bunch of stuff in here based on the great work Andrey
Konovalov is doing in fuzzing the USB stack. Lots of bug fixes when
dealing with corrupted USB descriptors that we've never seen in "normal"
operation, but is now ensuring the stack is much more hardened overall.
There's also the usual XHCI and gadget driver fixes as well, and a build
error fix, and a few other minor things, full details in the shortlog.
All of these have been in linux-next with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN/yw8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yl6pQCdGY+nPJhzj9EIeFj5QUpSuS4b1pYAoKrbNn+V
CMpg4iG1oXUtVL8jBbKa
=fVpl
-----END PGP SIGNATURE-----
Merge tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb
Pull USB fixes from Greg KH:
"Here are a number of USB fixes for 4.14-rc4 to resolved reported
issues.
There's a bunch of stuff in here based on the great work Andrey
Konovalov is doing in fuzzing the USB stack. Lots of bug fixes when
dealing with corrupted USB descriptors that we've never seen in
"normal" operation, but is now ensuring the stack is much more
hardened overall.
There's also the usual XHCI and gadget driver fixes as well, and a
build error fix, and a few other minor things, full details in the
shortlog.
All of these have been in linux-next with no reported issues"
* tag 'usb-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb: (38 commits)
usb: dwc3: of-simple: Add compatible for Spreadtrum SC9860 platform
usb: gadget: udc: atmel: set vbus irqflags explicitly
usb: gadget: ffs: handle I/O completion in-order
usb: renesas_usbhs: fix usbhsf_fifo_clear() for RX direction
usb: renesas_usbhs: fix the BCLR setting condition for non-DCP pipe
usb: gadget: udc: renesas_usb3: Fix return value of usb3_write_pipe()
usb: gadget: udc: renesas_usb3: fix Pn_RAMMAP.Pn_MPKT value
usb: gadget: udc: renesas_usb3: fix for no-data control transfer
USB: dummy-hcd: Fix erroneous synchronization change
USB: dummy-hcd: fix infinite-loop resubmission bug
USB: dummy-hcd: fix connection failures (wrong speed)
USB: cdc-wdm: ignore -EPIPE from GetEncapsulatedResponse
USB: devio: Don't corrupt user memory
USB: devio: Prevent integer overflow in proc_do_submiturb()
USB: g_mass_storage: Fix deadlock when driver is unbound
USB: gadgetfs: Fix crash caused by inadequate synchronization
USB: gadgetfs: fix copy_to_user while holding spinlock
USB: uas: fix bug in handling of alternate settings
usb-storage: unusual_devs entry to fix write-access regression for Seagate external drives
usb-storage: fix bogus hardware error messages for ATA pass-thru devices
...
Here are some small staging/IIO driver fixes for 4.14-rc4
Most of these have been in my tree for a while due to travels, sorry for
the delay. They resolve a number of small issues reported by people,
mostly for the iio drivers. Nothing major in here, full details are in
the shortlog.
All have been linux-next for a few weeks with no reported issues.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN+KQ8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+yluygCgneh7i/okOfsmt/p75eCA4ClWVLwAoIE7BZzt
1WdBcY/Zxv1ANIoY7ZTQ
=K+FX
-----END PGP SIGNATURE-----
Merge tag 'staging-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging
Pull staging/IIO fixes from Greg KH:
"Here are some small staging/IIO driver fixes for 4.14-rc4
Most of these have been in my tree for a while due to travels, sorry
for the delay. They resolve a number of small issues reported by
people, mostly for the iio drivers. Nothing major in here, full
details are in the shortlog.
All have been linux-next for a few weeks with no reported issues"
* tag 'staging-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging: (23 commits)
staging: iio: ad7192: Fix - use the dedicated reset function avoiding dma from stack.
iio: core: Return error for failed read_reg
iio: ad7793: Fix the serial interface reset
iio: ad_sigma_delta: Implement a dedicated reset function
IIO: BME280: Updates to Humidity readings need ctrl_reg write!
iio: adc: mcp320x: Fix readout of negative voltages
iio: adc: mcp320x: Fix oops on module unload
iio: adc: stm32: fix bad error check on max_channels
iio: trigger: stm32-timer: fix a corner case to write preset
iio: trigger: stm32-timer: preset shouldn't be buffered
iio: adc: twl4030: Return an error if we can not enable the vusb3v1 regulator in 'twl4030_madc_probe()'
iio: adc: twl4030: Disable the vusb3v1 rugulator in the error handling path of 'twl4030_madc_probe()'
iio: adc: twl4030: Fix an error handling path in 'twl4030_madc_probe()'
staging: rtl8723bs: avoid null pointer dereference on pmlmepriv
staging: rtl8723bs: add missing range check on id
staging: vchiq_2835_arm: Fix NULL ptr dereference in free_pagelist
staging: speakup: fix speakup-r empty line lockup
staging: pi433: Move limit check to switch default to kill warning
staging: r8822be: fix null pointer dereferences with a null driver_adapter
staging: mt29f_spinand: Enable the read ECC before program the page
...
Here are a few small fixes for 4.14-rc4.
The removal of DRIVER_ATTR() was almost completed by 4.14-rc1, but one
straggler made it in through some other tree (odds are, one of mine...)
So there's a simple removal of the last user, and then finally the macro
is removed from the tree.
There's a fix for old crazy udev instances that insist on reloading a
module when it is removed from the kernel due to the new uevents for
bind/unbind. This fixes the reported regression, hopefully some year in
the future we can drop the workaround, once users update to the latest
version, but I'm not holding my breath.
And then there's a build fix for a linker warning, and a buffer overflow
fix to match the PCI fixes you took through the PCI tree in the same
area.
All of these have been in linux-next for a few weeks while I've been
traveling, sorry for the delay.
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
-----BEGIN PGP SIGNATURE-----
iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCWdN8qA8cZ3JlZ0Brcm9h
aC5jb20ACgkQMUfUDdst+ymLEgCfUSSBhxW04teEcPua4QygLv2omK0An2SRkpnY
28nn+D+AfeOByQImY8v+
=RQY+
-----END PGP SIGNATURE-----
Merge tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core
Pull driver core fixes from Greg KH:
"Here are a few small fixes for 4.14-rc4.
The removal of DRIVER_ATTR() was almost completed by 4.14-rc1, but one
straggler made it in through some other tree (odds are, one of
mine...) So there's a simple removal of the last user, and then
finally the macro is removed from the tree.
There's a fix for old crazy udev instances that insist on reloading a
module when it is removed from the kernel due to the new uevents for
bind/unbind. This fixes the reported regression, hopefully some year
in the future we can drop the workaround, once users update to the
latest version, but I'm not holding my breath.
And then there's a build fix for a linker warning, and a buffer
overflow fix to match the PCI fixes you took through the PCI tree in
the same area.
All of these have been in linux-next for a few weeks while I've been
traveling, sorry for the delay"
* tag 'driver-core-4.14-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core:
driver core: remove DRIVER_ATTR
fpga: altera-cvp: remove DRIVER_ATTR() usage
driver core: platform: Don't read past the end of "driver_override" buffer
base: arch_topology: fix section mismatch build warnings
driver core: suppress sending MODALIAS in UNBIND uevents
Pull timer fixes from Thomas Gleixner:
"This adds a new timer wheel function which is required for the
conversion of the timer callback function from the 'unsigned long
data' argument to 'struct timer_list *timer'. This conversion has two
benefits:
1) It makes struct timer_list smaller
2) Many callers hand in a pointer to the timer or to the structure
containing the timer, which happens via type casting both at setup
and in the callback. This change gets rid of the typecasts.
Once the conversion is complete, which is planned for 4.15, the old
setup function and the intermediate typecast in the new setup function
go away along with the data field in struct timer_list.
Merging this now into mainline allows a smooth queueing of the actual
conversion in the affected maintainer trees without creating
dependencies"
* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
um/time: Fixup namespace collision
timer: Prepare to change timer callback argument type
Pull smp/hotplug fixes from Thomas Gleixner:
"This addresses the fallout of the new lockdep mechanism which covers
completions in the CPU hotplug code.
The lockdep splats are false positives, but there is no way to
annotate that reliably. The solution is to split the completions for
CPU up and down, which requires some reshuffling of the failure
rollback handling as well"
* 'smp-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
smp/hotplug: Hotplug state fail injection
smp/hotplug: Differentiate the AP completion between up and down
smp/hotplug: Differentiate the AP-work lockdep class between up and down
smp/hotplug: Callback vs state-machine consistency
smp/hotplug: Rewrite AP state machine core
smp/hotplug: Allow external multi-instance rollback
smp/hotplug: Add state diagram
Pull scheduler fixes from Thomas Gleixner:
"The scheduler pull request comes with the following updates:
- Prevent a divide by zero issue by validating the input value of
sysctl_sched_time_avg
- Make task state printing consistent all over the place and have
explicit state characters for IDLE and PARKED so they wont be
displayed as 'D' state which confuses tools"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
sched/sysctl: Check user input value of sysctl_sched_time_avg
sched/debug: Add explicit TASK_PARKED printing
sched/debug: Ignore TASK_IDLE for SysRq-W
sched/debug: Add explicit TASK_IDLE printing
sched/tracing: Use common task-state helpers
sched/tracing: Fix trace_sched_switch task-state printing
sched/debug: Remove unused variable
sched/debug: Convert TASK_state to hex
sched/debug: Implement consistent task-state printing
Including:
* A comment fix for 'struct iommu_ops'
* Format string fixes for AMD IOMMU, unfortunatly I missed that
during review.
* Limit mediatek physical addresses to 32 bit for v7s to fix a
warning triggered in io-page-table code.
* Fix dma-sync in io-pgtable-arm-v7s code
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABAgAGBQJZzk7lAAoJECvwRC2XARrjlIgP+gPzbUh41Lazpb1NtA8Qxlhn
GgIUmX4DBOVUvm5SXf8mnZ51JRiR31fEem5i6PGxRPAjukC0w18cemR+l9AtYXt/
4LrFDLGS75iDmLXKNSwaLTIr+LG9fEFj8YfrjRHh0eFLC8PDdWAtFP2pPHe/+NzO
b0YVF260JUOl4mK6gBl69NqNr06e1y0J5dpw04uynymgw6jwUGMxoCPjuvBbWYvk
Hhg8khZSd78Dr+vqV3kKfI2M41tGhf+m/Ozr8W7ZlkUpG1GLRXAjQ1D4k2SdGrzA
H5q1bowquh8pESYpqt4HfpZ9CDDM+D7SKqNMHgAKcPiXletOVokds9k0MTXOWqtA
Vpo7VWyELNt61A13Ev6q0+1ybfS4wkN5nnB6sxsF4WuBGgf+qQar79zQMMgNbaQX
ftIr6sTfe6Oe4cATtDrkmzYA9hqOwSgmM7PAybkp4Z/eCaWAc5mgZizhEy1/rVMi
AwEY/DFkclg6h1ZvKq+OVMey1sFkTMmCUsv7qX9I7Sml8gex6DRN94BPIze40/ZD
FP/UErITfOuIH9hLiHrm6tbq4XWsfBI00DJrXKTGwvGlM3JnMVFv4kaNyKDcIKH3
4fcZavGux8V4UGTaj0lqblQcTvH/+IbgBU8G5JXnujHED2jVY6BThOMr/H/DHtND
KxJE+TW/YCw/ARyCpwIv
=USO1
-----END PGP SIGNATURE-----
Merge tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu
Pull IOMMU fixes from Joerg Roedel:
- A comment fix for 'struct iommu_ops'
- Format string fixes for AMD IOMMU, unfortunatly I missed that during
review.
- Limit mediatek physical addresses to 32 bit for v7s to fix a warning
triggered in io-page-table code.
- Fix dma-sync in io-pgtable-arm-v7s code
* tag 'iommu-fixes-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/joro/iommu:
iommu: Fix comment for iommu_ops.map_sg
iommu/amd: pr_err() strings should end with newlines
iommu/mediatek: Limit the physical address in 32bit for v7s
iommu/io-pgtable-arm-v7s: Need dma-sync while there is no QUIRK_NO_DMA
Pull keys fixes from James Morris:
"Notable here is a rewrite of big_key crypto by Jason Donenfeld to
address some issues in the original code.
From Jason's commit log:
"This started out as just replacing the use of crypto/rng with
get_random_bytes_wait, so that we wouldn't use bad randomness at
boot time. But, upon looking further, it appears that there were
even deeper underlying cryptographic problems, and that this seems
to have been committed with very little crypto review. So, I rewrote
the whole thing, trying to keep to the conventions introduced by the
previous author, to fix these cryptographic flaws."
There has been positive review of the new code by Eric Biggers and
Herbert Xu, and it passes basic testing via the keyutils test suite.
Eric also manually tested it.
Generally speaking, we likely need to improve the amount of crypto
review for kernel crypto users including keys (I'll post a note
separately to ksummit-discuss)"
* 'fixes-v4.14-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
security/keys: rewrite all of big_key crypto
security/keys: properly zero out sensitive key material in big_key
KEYS: use kmemdup() in request_key_auth_new()
KEYS: restrict /proc/keys by credentials at open time
KEYS: reset parent each time before searching key_user_tree
KEYS: prevent KEYCTL_READ on negative key
KEYS: prevent creating a different user's keyrings
KEYS: fix writing past end of user-supplied buffer in keyring_read()
KEYS: fix key refcount leak in keyctl_read_key()
KEYS: fix key refcount leak in keyctl_assume_authority()
KEYS: don't revoke uninstantiated key in request_key_auth_new()
KEYS: fix cred refcount leak in request_key_auth_new()
Currently TASK_PARKED is masqueraded as TASK_INTERRUPTIBLE, give it
its own print state because it will not in fact get woken by regular
wakeups and is a long-term state.
This requires moving TASK_PARKED into the TASK_REPORT mask, and since
that latter needs to be a contiguous bitmask, we need to shuffle the
bits around a bit.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Markus reported that kthreads that idle using TASK_IDLE instead of
TASK_INTERRUPTIBLE are reported in as TASK_UNINTERRUPTIBLE and things
like htop mark those red.
This is undesirable, so add an explicit state for TASK_IDLE.
Reported-by: Markus Trippelsdorf <markus@trippelsdorf.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Convert trace_sched_switch to use the common task-state helpers and
fix the "X" and "Z" order, possibly they ended up in the wrong order
because TASK_REPORT has them in the wrong order too.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Bit patterns are easier in hex.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Currently get_task_state() and task_state_to_char() report different
states, create a number of common helpers and unify the reported state
space.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar <mingo@kernel.org>
- a few core fixes
- a few ipoib fixes
- a few mlx5 fixes
- a 7 patch hfi1 related series
-----BEGIN PGP SIGNATURE-----
iQIcBAABAgAGBQJZy8HKAAoJELgmozMOVy/d/3YP/RtJ4I+7dlHAdTrUsLkNIXzj
6e2sc5A7JQRvhbWa6ZfqkbD4DBz2gkz9bXmlYotP1nVfunBie9xQPi+nN39YNnTv
VPYa0G7RD53APw71ETCGh0uBBAjc8lGm0AOPj+HpSP7PvrLdH6B68IcAeXCSOf8D
orzXI0bRpRnLsW4IJ0zN09zShigYuCJVl0Wf59QB0Wrbw4veQD4W7bLSCAUTmuZk
TPb8bPlXY64Bf731HRftxIRl3HwUrpTPv5DuHcASAbVL/KeucWpPmOAj9XqhXTQp
tnqtiwBWYDcsLBwS/IS40B2gfN1BCh6hn03pSVbPj+HD/FLY7x8Gf/Lu0qQNmklz
9nvgMKHL/2h+T4M7DulhS7DTP58bvtkyKG+j77gjEmKX1OI0NXHOntKZDSjGAT2J
zw2dNx4Y/Sgng1HBCbHAAHMrFUdyj7XpQNR8mzdGvDcwtRfrDKmchGtvhVclPsbl
R3U9GN2NcAwg2+bIN96hTzUMB10QOZdvddGFvbxuB7FaWkskPaN52O1ptT3+MyWt
xccZp0iYu40zV80mEm+nF/kZwR8omfE6xM1ujQdIhMHstGe+z29BhqsaQ8Zw1qEG
oaU7+9m2aK57SvcSimR2S4kdK7Gxw9+BIVKdRREJwe9xvWVf96OvJnhnh5t5Fs56
BTN1mBn+7LxlK9eDVler
=HbhA
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma
Pull rdma fixes from Doug Ledford:
"Second -rc update for 4.14.
Both Mellanox and Intel had a series of -rc fixes that landed this
week. The Mellanox bunch is spread throughout the stack and not just
in their driver, where as the Intel bunch was mostly in the hfi1
driver. And, several of the fixes in the hfi1 driver were more than
just simple 5 line fixes. As a result, the hfi1 driver fixes has a
sizable LOC count.
Everything else is as one would expect in an RC cycle in terms of LOC
count. One item that might jump out and make you think "That's not an
rc item" is the fix that corrects a typo. But, that change fixes a
typo in a user visible API that was just added in this merge window,
so if we fix it now, we can fix it. If we don't, the typo is in the
API forever. Another that might not appear to be a fix at first glance
is the Simplify mlx5_ib_cont_pages patch, but the simplification
allows them to fix a bug in the existing function whenever the length
of an SGE exceeded page size. We also had to revert one patch from the
merge window that was wrong.
Summary:
- a few core fixes
- a few ipoib fixes
- a few mlx5 fixes
- a 7-patch hfi1 related series"
* tag 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dledford/rdma:
IB/hfi1: Unsuccessful PCIe caps tuning should not fail driver load
IB/hfi1: On error, fix use after free during user context setup
Revert "IB/ipoib: Update broadcast object if PKey value was changed in index 0"
IB/hfi1: Return correct value in general interrupt handler
IB/hfi1: Check eeprom config partition validity
IB/hfi1: Only reset QSFP after link up and turn off AOC TX
IB/hfi1: Turn off AOC TX after offline substates
IB/mlx5: Fix NULL deference on mlx5_ib_update_xlt failure
IB/mlx5: Simplify mlx5_ib_cont_pages
IB/ipoib: Fix inconsistency with free_netdev and free_rdma_netdev
IB/ipoib: Fix sysfs Pkey create<->remove possible deadlock
IB: Correct MR length field to be 64-bit
IB/core: Fix qp_sec use after free access
IB/core: Fix typo in the name of the tag-matching cap struct
Modern kernel callback systems pass the structure associated with a
given callback to the callback function. The timer callback remains one
of the legacy cases where an arbitrary unsigned long argument continues
to be passed as the callback argument. This has several problems:
- This bloats the timer_list structure with a normally redundant
.data field.
- No type checking is being performed, forcing callbacks to do
explicit type casts of the unsigned long argument into the object
that was passed, rather than using container_of(), as done in most
of the other callback infrastructure.
- Neighboring buffer overflows can overwrite both the .function and
the .data field, providing attackers with a way to elevate from a buffer
overflow into a simplistic ROP-like mechanism that allows calling
arbitrary functions with a controlled first argument.
- For future Control Flow Integrity work, this creates a unique function
prototype for timer callbacks, instead of allowing them to continue to
be clustered with other void functions that take a single unsigned long
argument.
This adds a new timer initialization API, which will ultimately replace
the existing setup_timer(), setup_{deferrable,pinned,etc}_timer() family,
named timer_setup() (to mirror hrtimer_setup(), making instances of its
use much easier to grep for).
In order to support the migration of existing timers into the new
callback arguments, timer_setup() casts its arguments to the existing
legacy types, and explicitly passes the timer pointer as the legacy
data argument. Once all setup_*timer() callers have been replaced with
timer_setup(), the casts can be removed, and the data argument can be
dropped with the timer expiration code changed to just pass the timer
to the callback directly.
Since the regular pattern of using container_of() during local variable
declaration repeats the need for the variable type declaration
to be included, this adds a helper modeled after other from_*()
helpers that wrap container_of(), named from_timer(). This helper uses
typeof(*variable), removing the type redundancy and minimizing the need
for line wraps in forthcoming conversions from "unsigned data long" to
"struct timer_list *" in the timer callbacks:
-void callback(unsigned long data)
+void callback(struct timer_list *t)
{
- struct some_data_structure *local = (struct some_data_structure *)data;
+ struct some_data_structure *local = from_timer(local, t, timer);
Finally, in order to support the handful of timer users that perform
open-coded assignments of the .function (and .data) fields, provide
cast macros (TIMER_FUNC_TYPE and TIMER_DATA_TYPE) that can be used
temporarily. Once conversion has been completed, these can be globally
trivially removed.
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20170928133817.GA113410@beast
From David Howells:
"There are two sets of patches here:
(1) A bunch of core keyrings bug fixes from Eric Biggers.
(2) Fixing big_key to use safe crypto from Jason A. Donenfeld."
The definition of map_sg was split during a recent addition to iommu_ops.
Put it back together.
Fixes: add02cfdc9 ("iommu: Introduce Interface for IOMMU TLB Flushing")
Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
Signed-off-by: Joerg Roedel <jroedel@suse.de>
As raw_cpu_generic_read() is a plain read from a raw_cpu_ptr() address,
it's possible (albeit unlikely) that the compiler will split the access
across multiple instructions.
In this_cpu_generic_read() we disable preemption but not interrupts
before calling raw_cpu_generic_read(). Thus, an interrupt could be taken
in the middle of the split load instructions. If a this_cpu_write() or
RMW this_cpu_*() op is made to the same variable in the interrupt
handling path, this_cpu_read() will return a torn value.
For native word types, we can avoid tearing using READ_ONCE(), but this
won't work in all cases (e.g. 64-bit types on most 32-bit platforms).
This patch reworks this_cpu_generic_read() to use READ_ONCE() where
possible, otherwise falling back to disabling interrupts.
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Christoph Lameter <cl@linux.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Pranith Kumar <bobby.prani@gmail.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-arch@vger.kernel.org
Cc: stable@vger.kernel.org
Signed-off-by: Tejun Heo <tj@kernel.org>
Pull block fixes from Jens Axboe:
- Two sets of NVMe pull requests from Christoph:
- Fixes for the Fibre Channel host/target to fix spec compliance
- Allow a zero keep alive timeout
- Make the debug printk for broken SGLs work better
- Fix queue zeroing during initialization
- Set of RDMA and FC fixes
- Target div-by-zero fix
- bsg double-free fix.
- ndb unknown ioctl fix from Josef.
- Buffered vs O_DIRECT page cache inconsistency fix. Has been floating
around for a long time, well reviewed. From Lukas.
- brd overflow fix from Mikulas.
- Fix for a loop regression in this merge window, where using a union
for two members of the loop_cmd turned out to be a really bad idea.
From Omar.
- Fix for an iostat regression fix in this series, using the wrong API
to get at the block queue. From Shaohua.
- Fix for a potential blktrace delection deadlock. From Waiman.
* 'for-linus' of git://git.kernel.dk/linux-block: (30 commits)
nvme-fcloop: fix port deletes and callbacks
nvmet-fc: sync header templates with comments
nvmet-fc: ensure target queue id within range.
nvmet-fc: on port remove call put outside lock
nvme-rdma: don't fully stop the controller in error recovery
nvme-rdma: give up reconnect if state change fails
nvme-core: Use nvme_wq to queue async events and fw activation
nvme: fix sqhd reference when admin queue connect fails
block: fix a crash caused by wrong API
fs: Fix page cache inconsistency when mixing buffered and AIO DIO
nvmet: implement valid sqhd values in completions
nvme-fabrics: Allow 0 as KATO value
nvme: allow timed-out ios to retry
nvme: stop aer posting if controller state not live
nvme-pci: Print invalid SGL only once
nvme-pci: initialize queue memory before interrupts
nvmet-fc: fix failing max io queue connections
nvme-fc: use transport-specific sgl format
nvme: add transport SGL definitions
nvme.h: remove FC transport-specific error values
...
Add a sysfs file to one-time fail a specific state. This can be used
to test the state rollback code paths.
Something like this (hotplug-up.sh):
#!/bin/bash
echo 0 > /debug/sched_debug
echo 1 > /debug/tracing/events/cpuhp/enable
ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
STATES=${1:-$ALL_STATES}
for state in $STATES
do
echo 0 > /sys/devices/system/cpu/cpu1/online
echo 0 > /debug/tracing/trace
echo Fail state: $state
echo $state > /sys/devices/system/cpu/cpu1/hotplug/fail
cat /sys/devices/system/cpu/cpu1/hotplug/fail
echo 1 > /sys/devices/system/cpu/cpu1/online
cat /debug/tracing/trace > hotfail-${state}.trace
sleep 1
done
Can be used to test for all possible rollback (barring multi-instance)
scenarios on CPU-up, CPU-down is a trivial modification of the above.
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org
Comments were incorrect:
- defer_rcv was in host port template. moved to target port template
- Added Mandatory statements for target port template items
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
If CONFIG_PCI=n and gcc (e.g. 4.1.2) decides not to inline
get_pci_function_alias_group(), the build fails with:
drivers/iommu/iommu.o: In function `get_pci_function_alias_group':
iommu.c:(.text+0xfdc): undefined reference to `pci_acs_enabled'
Due to the various dummies for PCI calls in the CONFIG_PCI=n case,
pci_acs_enabled() never called, but not all versions of gcc are smart
enough to realize that.
While explicitly marking get_pci_function_alias_group() inline would fix
the build, this would inflate the code for the CONFIG_PCI=y case, as
get_pci_function_alias_group() is a not-so-small function called from two
places.
Hence fix the issue by introducing a dummy for pci_acs_enabled() instead.
Fixes: 0ae349a0f3 ("iommu/qcom: Add qcom_iommu")
Signed-off-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
The ib_mr->length represents the length of the MR in bytes as per
the IBTA spec 1.3 section 11.2.10.3 (REGISTER PHYSICAL MEMORY REGION).
Currently ib_mr->length field is defined as only 32-bits field.
This might result into truncation and failed WRs of consumers who
registers more than 4GB bytes memory regions and whose WRs accessing
such MRs.
This patch makes the length 64-bit to avoid such truncation.
Cc: Sagi Grimberg <sagi@grimberg.me>
Cc: Chuck Lever <chuck.lever@oracle.com>
Cc: Faisal Latif <faisal.latif@intel.com>
Fixes: 4c67e2bfc8 ("IB/core: Introduce new fast registration API")
Signed-off-by: Ilya Lesokhin <ilyal@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Doug Ledford <dledford@redhat.com>
The tag matching functionality is implemented by mlx5 driver
by extending XRQ, however this internal kernel information was
exposed to user space applications with *xrq* name instead of *tm*.
This patch renames *xrq* to *tm* to handle that.
Fixes: 8d50505ada ("IB/uverbs: Expose XRQ capabilities")
Signed-off-by: Leon Romanovsky <leonro@mellanox.com>
Reviewed-by: Yishai Hadas <yishaih@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
Add transport SGL defintions from NVMe TP 4008, required for
the final NVMe-FC standard.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The NVM express group recinded the reserved range for the transport.
Remove the FC-centric values that had been defined.
Signed-off-by: James Smart <james.smart@broadcom.com>
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
The lockdep code had reported the following unsafe locking scenario:
CPU0 CPU1
---- ----
lock(s_active#228);
lock(&bdev->bd_mutex/1);
lock(s_active#228);
lock(&bdev->bd_mutex);
*** DEADLOCK ***
The deadlock may happen when one task (CPU1) is trying to delete a
partition in a block device and another task (CPU0) is accessing
tracing sysfs file (e.g. /sys/block/dm-1/trace/act_mask) in that
partition.
The s_active isn't an actual lock. It is a reference count (kn->count)
on the sysfs (kernfs) file. Removal of a sysfs file, however, require
a wait until all the references are gone. The reference count is
treated like a rwsem using lockdep instrumentation code.
The fact that a thread is in the sysfs callback method or in the
ioctl call means there is a reference to the opended sysfs or device
file. That should prevent the underlying block structure from being
removed.
Instead of using bd_mutex in the block_device structure, a new
blk_trace_mutex is now added to the request_queue structure to protect
access to the blk_trace structure.
Suggested-by: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Fix typo in patch subject line, and prune a comment detailing how
the code used to work.
Signed-off-by: Jens Axboe <axboe@kernel.dk>
It was possible for an unprivileged user to create the user and user
session keyrings for another user. For example:
sudo -u '#3000' sh -c 'keyctl add keyring _uid.4000 "" @u
keyctl add keyring _uid_ses.4000 "" @u
sleep 15' &
sleep 1
sudo -u '#4000' keyctl describe @u
sudo -u '#4000' keyctl describe @us
This is problematic because these "fake" keyrings won't have the right
permissions. In particular, the user who created them first will own
them and will have full access to them via the possessor permissions,
which can be used to compromise the security of a user's keys:
-4: alswrv-----v------------ 3000 0 keyring: _uid.4000
-5: alswrv-----v------------ 3000 0 keyring: _uid_ses.4000
Fix it by marking user and user session keyrings with a flag
KEY_FLAG_UID_KEYRING. Then, when searching for a user or user session
keyring by name, skip all keyrings that don't have the flag set.
Fixes: 69664cf16a ("keys: don't generate user and user session keyrings unless they're accessed")
Cc: <stable@vger.kernel.org> [v2.6.26+]
Signed-off-by: Eric Biggers <ebiggers@google.com>
Signed-off-by: David Howells <dhowells@redhat.com>
Note this includes fixes from recent merge window. As such the tree
is based on top of a prior staging/staging-next tree.
* iio core
- return and error for a failed read_reg debugfs call rather than
eating the error.
* ad7192
- Use the dedicated reset function in the ad_sigma_delta library
instead of an spi transfer with the data on the stack which
could cause problems with DMA.
* ad7793
- Implement a dedicate reset function in the ad_sigma_delta library
and use it to correctly reset this part.
* bme280
- ctrl_reg write must occur after any register writes
for updates to take effect.
* mcp320x
- negative voltage readout was broken.
- Fix an oops on module unload due to spi_set_drvdata not being called
in probe.
* st_magn
- Fix the data ready line configuration for the lis3mdl. It is not
configurable so the st_magn core was assuming it didn't exist
and so wasn't consuming interrupts resulting in an unhandled
interrupt.
* stm32-adc
- off by one error on max channels checking.
* stm32-timer
- preset should not be buffered - reorganising register writes avoids
this.
- fix a corner case in which write preset goes wrong when a timer is
used first as a trigger then as a counter with preset. Odd case but
you never know.
* ti-ads1015
- Fix setting of comparator polarity by fixing bitfield definition.
* twl4030
- Error path handling fix to cleanup in event of regulator
registration failure.
- Disable the vusb3v1 regulator correctly in error handling
- Don't paper over a regulator enable failure.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEEbilms4eEBlKRJoGxVIU0mcT0FogFAlnH2PkRHGppYzIzQGtl
cm5lbC5vcmcACgkQVIU0mcT0FoiNFw//XKT/CRq7R8Dg5vy4FT7NRw/SpSVH9u8M
JsIISyAoh0jfOFJFkVv2Ll+Zs+SkbXJOPMUreaFJMnftSCkBdteT9D5kz5cYiUkO
sFn58YEEqKCqQVJBEi0yM00HMxbpNKiNHXzo9WC8ZiUVRJqSigq1coznSqaZngRd
yf4a9Z33jiH/NXalqy5Vb4CSn76UWpvvYJ11tuNxXqOldkyU4sm8bvsPWnBg1qG3
YIuVZs83Rv1HW0t6ktYf3WwC2rBhKEM8st2fUCLkU+4+foDuqQBiFdWJzTcxe6Ru
8za+q3ngOo7rRkLS8cSTsh8hbfKwIV0e3rQvsN3zqodqjPilw9uH3eybMYzLK/0I
j9o12jaD1Ow7zQfiQnNnyJqDLddEt+f6GcrI/iYQPlTIQIhqDii4YInDyCrF4xbr
i/8saOr5n4PsdHRovTUl2rrgiX+Jix6OmNLEtUdp1HWrc0XwbZ5L/1g6pXFL98QZ
Kq8MwjNu+6x+66Hga/1fll19hZ/VMYGqxFtoP+6S28TS1bO9dXm6XNl5/U5Nh5mN
J255UFCdFkBmJO9nP7tmbDscMCAbhc6AOHKxjUebuoNnyhyV+20W9S/gpzTIilnO
Gv9D0z29zITc8Hpfhl2YKC6F00BuRAdPFJo05rSE/KoWX3bkPCiXFTfQ+aPwB72h
bYT4oJeUoQg=
=CcjQ
-----END PGP SIGNATURE-----
Merge tag 'iio-fixes-for-4.14a' of git://git.kernel.org/pub/scm/linux/kernel/git/jic23/iio into staging-linus
Jonathan writes:
First round of IIO fixes for the 4.14 cycle
Note this includes fixes from recent merge window. As such the tree
is based on top of a prior staging/staging-next tree.
* iio core
- return and error for a failed read_reg debugfs call rather than
eating the error.
* ad7192
- Use the dedicated reset function in the ad_sigma_delta library
instead of an spi transfer with the data on the stack which
could cause problems with DMA.
* ad7793
- Implement a dedicate reset function in the ad_sigma_delta library
and use it to correctly reset this part.
* bme280
- ctrl_reg write must occur after any register writes
for updates to take effect.
* mcp320x
- negative voltage readout was broken.
- Fix an oops on module unload due to spi_set_drvdata not being called
in probe.
* st_magn
- Fix the data ready line configuration for the lis3mdl. It is not
configurable so the st_magn core was assuming it didn't exist
and so wasn't consuming interrupts resulting in an unhandled
interrupt.
* stm32-adc
- off by one error on max channels checking.
* stm32-timer
- preset should not be buffered - reorganising register writes avoids
this.
- fix a corner case in which write preset goes wrong when a timer is
used first as a trigger then as a counter with preset. Odd case but
you never know.
* ti-ads1015
- Fix setting of comparator polarity by fixing bitfield definition.
* twl4030
- Error path handling fix to cleanup in event of regulator
registration failure.
- Disable the vusb3v1 regulator correctly in error handling
- Don't paper over a regulator enable failure.
Pull irq fixes from Ingo Molnar:
"Three irqchip driver fixes, and an affinity mask helper function bug
fix affecting x86"
* 'irq-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
Revert "genirq: Restrict effective affinity to interrupts actually using it"
irqchip.mips-gic: Fix shared interrupt mask writes
irqchip/gic-v4: Fix building with ancient gcc
irqchip/gic-v3: Iterate over possible CPUs by for_each_possible_cpu()
Pull address-limit checking fixes from Ingo Molnar:
"This fixes a number of bugs in the address-limit (USER_DS) checks that
got introduced in the merge window, (mostly) affecting the ARM and
ARM64 platforms"
* 'core-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
arm64/syscalls: Move address limit check in loop
arm/syscalls: Optimize address limit check
Revert "arm/syscalls: Check address limit on user-mode return"
syscalls: Use CHECK_DATA_CORRUPTION for addr_limit_user_check
Since most of the SD ADCs have the option of reseting the serial
interface by sending a number of SCLKs with CS = 0 and DIN = 1,
a dedicated function that can do this is usefull.
Needed for the patch: iio: ad7793: Fix the serial interface reset
Signed-off-by: Dragos Bogdan <dragos.bogdan@analog.com>
Acked-by: Lars-Peter Clausen <lars@metafoo.de>
Cc: <Stable@vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Pull networking fixes from David Miller:
1) Fix NAPI poll list corruption in enic driver, from Christian
Lamparter.
2) Fix route use after free, from Eric Dumazet.
3) Fix regression in reuseaddr handling, from Josef Bacik.
4) Assert the size of control messages in compat handling since we copy
it in from userspace twice. From Meng Xu.
5) SMC layer bug fixes (missing RCU locking, bad refcounting, etc.)
from Ursula Braun.
6) Fix races in AF_PACKET fanout handling, from Willem de Bruijn.
7) Don't use ARRAY_SIZE on spinlock array which might have zero
entries, from Geert Uytterhoeven.
8) Fix miscomputation of checksum in ipv6 udp code, from Subash Abhinov
Kasiviswanathan.
9) Push the ipv6 header properly in ipv6 GRE tunnel driver, from Xin
Long.
* git://git.kernel.org/pub/scm/linux/kernel/git/davem/net: (75 commits)
inet: fix improper empty comparison
net: use inet6_rcv_saddr to compare sockets
net: set tb->fast_sk_family
net: orphan frags on stand-alone ptype in dev_queue_xmit_nit
MAINTAINERS: update git tree locations for ieee802154 subsystem
net: prevent dst uses after free
net: phy: Fix truncation of large IRQ numbers in phy_attached_print()
net/smc: no close wait in case of process shut down
net/smc: introduce a delay
net/smc: terminate link group if out-of-sync is received
net/smc: longer delay for client link group removal
net/smc: adapt send request completion notification
net/smc: adjust net_device refcount
net/smc: take RCU read lock for routing cache lookup
net/smc: add receive timeout check
net/smc: add missing dev_put
net: stmmac: Cocci spatch "of_table"
lan78xx: Use default values loaded from EEPROM/OTP after reset
lan78xx: Allow EEPROM write for less than MAX_EEPROM_SIZE
lan78xx: Fix for eeprom read/write when device auto suspend
...
- Fix the initialization of resources in the ACPI WDAT watchdog
driver that uses unititialized memory which causes compiler
warnings to be triggered (Arnd Bergmann).
- Fix a recent regression in the ACPI device properties handling
that causes some device properties data to be skipped during
enumeration (Sakari Ailus).
- Fix a recent change in behavior that caused the ACPI_HANDLE()
macro to stop working for non-GPL code which is a problem for
the NVidia binary graphics driver, for example (John Hubbard).
- Add a MAINTAINERS entry for the ACPI PMIC drivers to specify
the official reviewers for that code (Rafael Wysocki).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZxYYnAAoJEILEb/54YlRx18AP/2PObVzO2qlf+U6Ikvgnl3cY
6ExyT7iQyTF8nQMdx2xITKU7PUd3EQBiFjsFgw/O5VvHsnT1a25nX8AXjEQ4X3UC
FyJvAh55D+3qWSle+OPHr0qdErtpNZsjTaGlYdHdqqWxgKbsgqZyRXN5XtbCytUh
Oa51G+fFAAC7zPvYzPiUGuUm39CzJQ97HtKUD43nDinu3ui2Tjutkw5HzZ5DF/3d
gQ7lgTDYBkgMrurGhrZNdB7rCQzn5QRKw7HOeWFqciqxTaREaKYzhruvGvJZgPbf
oY9/rFabC/okyVlc4oxAXkaqrZsuNRxhTSeqwSIG64Jfji2xnGDLpw6OP4S/ABJQ
t198pcbJVWMCsM7K6aEbVv0HixqzA1xIwqgmNPTTbmWS+SvtE2zuGKK378sxKLo0
SFqQE6Uh5Nux6oyVeSwQP5gCQIcOboHkmriCMg4gOGVwsg92Hvj6ymY5hTfJg7IO
6AeBTxQr4nNZTOsJAKr2qRaBwYRSzeSg/mtyW+l/BPD+kK0I5+rGVMD0xt36X4D/
bXF04gmRI1Rc2vsyHhGNCs83Q+BNdB4gNR9gynX2TCgSuVOvHOPJlI4YuYTlb5/1
DLL+s3If9Z0s9ZzxsHE5CgGZmLlthX/qbrxmMGY2zmtKcsFJLDfR9xVn4tzarpxK
/a4dsJMJZZ8oO7fe8GUK
=2LBb
-----END PGP SIGNATURE-----
Merge tag 'acpi-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull ACPI fixes from Rafael Wysocki:
"These fix the initialization of resources in the ACPI WDAT watchdog
driver, a recent regression in the ACPI device properties handling, a
recent change in behavior causing the ACPI_HANDLE() macro to only work
for GPL code and create a MAINTAINERS entry for ACPI PMIC drivers in
order to specify the official reviewers for that code.
Specifics:
- Fix the initialization of resources in the ACPI WDAT watchdog
driver that uses unititialized memory which causes compiler
warnings to be triggered (Arnd Bergmann).
- Fix a recent regression in the ACPI device properties handling that
causes some device properties data to be skipped during enumeration
(Sakari Ailus).
- Fix a recent change in behavior that caused the ACPI_HANDLE() macro
to stop working for non-GPL code which is a problem for the NVidia
binary graphics driver, for example (John Hubbard).
- Add a MAINTAINERS entry for the ACPI PMIC drivers to specify the
official reviewers for that code (Rafael Wysocki)"
* tag 'acpi-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
ACPI: properties: Return _DSD hierarchical extension (data) sub-nodes correctly
ACPI / bus: Make ACPI_HANDLE() work for non-GPL code again
ACPI / watchdog: properly initialize resources
ACPI / PMIC: Add code reviewers to MAINTAINERS
- Fix a regression in cpufreq on systems using DT as the source of
CPU configuration information where two different code paths
attempt to create the cpufreq-dt device object (there can be only
one) and fix up the "compatible" matching for some TI platforms
on top of that (Viresh Kumar, Dave Gerlach).
- Fix an initialization time memory leak in cpuidle on ARM which
occurs if the cpuidle driver initialization fails (Stefan Wahren).
- Fix a PM core function that checks whether or not there are any
system suspend/resume callbacks for a device, but forgets to
check legacy callbacks which then may be skipped incorrectly
and the system may crash and/or the device may become unusable
after a suspend-resume cycle (Rafael Wysocki).
- Fix request type validation for latency tolerance PM QoS requests
which may lead to unexpected behavior (Jan Schönherr).
- Fix a broken link to PM documentation from a header file and a
typo in a PM document (Geert Uytterhoeven, Rafael Wysocki).
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQIcBAABCAAGBQJZxYXOAAoJEILEb/54YlRxgkQP/1tQha6hykoryGscv1XGFPsc
VP7tpsf00gjMcUnbuRFbgD+jF60SAWtt48sIACUOqtxSXRTGdPHS8zggLo9pa0+A
A0Y8QlFXYHlk6up2iUyNOE0C2+UfD1pnh1fGfDtNLKob+hDhrrmRI2foMRVij9Cw
0+szDkUENcVwLcCzyWgv7uCdrG823kzBS0gCOJT4cpYid+s7lfFEwRZMz7m2Q/2Y
s0MMELdhBaaMAZyGIh28B2D+bAS1QTTxLr++nSiiIejVplraWlUYRToGlQMpZbJn
yPjKld5Zc6+VJ4sPl3oRw3wnfo4Rndwgx0VDGDdne1e0S7pQtIjNRbyGTbl2n6yS
+nckLpW767l9Mt9TVQzQBgnalUUZX78sqHqa3Ca9j8QXFcG0Khn5aewq76+9KVdv
uSAxUmthvErJUGa6GTkhBGywZxKtuIdA/eg23XDN/P2PSVW8hj+z4Fn8DylXSEo7
H3KMOJgSRWDFvAaYsusNEdQwy6+UWN9OiMoZYMVL6XcDYyFZrAaXtaZIb7MjOL5i
U9cv3QCU4CWtsbKoQb3qvQZudUa92MIHtLWILTuLT/lWPIPGiUGLbIYRE+hzRzkj
26J21Ex8veYuGEJT+YP6GBqtywX9aKuec2RZVSfqtAK/x/zXxwjFP6T3JHVril3T
/mxhLZIg/aJwIeJQ3JTG
=Cbgd
-----END PGP SIGNATURE-----
Merge tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management fixes from Rafael Wysocki:
"These fix a cpufreq regression introduced by recent changes related to
the generic DT driver, an initialization time memory leak in cpuidle
on ARM, a PM core bug that may cause system suspend/resume to fail on
some systems, a request type validation issue in the PM QoS framework
and two documentation-related issues.
Specifics:
- Fix a regression in cpufreq on systems using DT as the source of
CPU configuration information where two different code paths
attempt to create the cpufreq-dt device object (there can be only
one) and fix up the "compatible" matching for some TI platforms on
top of that (Viresh Kumar, Dave Gerlach).
- Fix an initialization time memory leak in cpuidle on ARM which
occurs if the cpuidle driver initialization fails (Stefan Wahren).
- Fix a PM core function that checks whether or not there are any
system suspend/resume callbacks for a device, but forgets to check
legacy callbacks which then may be skipped incorrectly and the
system may crash and/or the device may become unusable after a
suspend-resume cycle (Rafael Wysocki).
- Fix request type validation for latency tolerance PM QoS requests
which may lead to unexpected behavior (Jan Schönherr).
- Fix a broken link to PM documentation from a header file and a typo
in a PM document (Geert Uytterhoeven, Rafael Wysocki)"
* tag 'pm-4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
cpufreq: ti-cpufreq: Support additional am43xx platforms
ARM: cpuidle: Avoid memleak if init fail
cpufreq: dt-platdev: Add some missing platforms to the blacklist
PM: core: Fix device_pm_check_callbacks()
PM: docs: Drop an excess character from devices.rst
PM / QoS: Use the correct variable to check the QoS request type
driver core: Fix link to device power management documentation
Pull input fixes from Dmitry Torokhov:
- fixes for two long standing issues (lock up and a crash) in force
feedback handling in uinput driver
- tweak to firmware update timing in Elan I2C touchpad driver.
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/dtor/input:
Input: elan_i2c - extend Flash-Write delay
Input: uinput - avoid crash when sending FF request to device going away
Input: uinput - avoid FF flush when destroying device
- sysctl and seccomp operation to discover available actions. (tyhicks)
- new per-filter configurable logging infrastructure and sysctl. (tyhicks)
- SECCOMP_RET_LOG to log allowed syscalls. (tyhicks)
- SECCOMP_RET_KILL_PROCESS as the new strictest possible action.
- self-tests for new behaviors.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
Comment: Kees Cook <kees@outflux.net>
iQIcBAABCgAGBQJZxVbTAAoJEIly9N/cbcAmvIAQALR9aVQQXjma4lLhZxwTsLtG
rJm8t/o4y/2aBV8vzpFbMPT5gfN/PAkHJpCoxVPssx0k4PH2M7HjpnR6E1OC+erg
RNom3uNdNqZeFlDpdX1qriYiCTB9p6rHe0DPwgG9iGqgDxsJ+G3W+x1sMZ1C+A0M
shxA3fwt+Qpivo8Zq44xjMFjK+Zeor9V3yPc51QoZktWHlM16ID3HvHVnUtzqAUb
nTWF6ZlmZlJ/lp4Dq8/55lytVcXPo240G3H0Odai+SNFakK6p5UO//BRBV209bmb
05jpAOH6uym1sxVz00TQXCtDqOEzs2mQgomtTSShHg8SrLFX7nFkEFtAVA6tEri2
FqDYce9KX7ZtOYiq83C7pnpAFCouc0z31dQl9USHiAiexXklwBIX+OsVv98omWGi
pW43uLE2ovY0cpOsN50xI4mnxiGh6MhFcdbor2VLRJwLIFSw3XjjgNCCLyK4AJxs
N514252qi70c9cWyAHYDLy077yTVxu3JUlsVQKtRTMfoFUq6bX1jPXVXE8qkVrui
bc/Ay54pPrUwM854IpQ9ZBOuMfs6I5opocGIsBvMaND45U4o2B0ANCsxhuZ0zEtM
E55DhK5OgjukNemQmlWK2foDckYdtkJXCj2yMBNQady0Uynr2BWZ6VDBP7vFcnRB
UihRlFZRZleu8383uHsc
=sKeC
-----END PGP SIGNATURE-----
Merge tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux
Pull seccomp updates from Kees Cook:
"Major additions:
- sysctl and seccomp operation to discover available actions
(tyhicks)
- new per-filter configurable logging infrastructure and sysctl
(tyhicks)
- SECCOMP_RET_LOG to log allowed syscalls (tyhicks)
- SECCOMP_RET_KILL_PROCESS as the new strictest possible action
- self-tests for new behaviors"
[ This is the seccomp part of the security pull request during the merge
window that was nixed due to unrelated problems - Linus ]
* tag 'seccomp-v4.14-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
samples: Unrename SECCOMP_RET_KILL
selftests/seccomp: Test thread vs process killing
seccomp: Implement SECCOMP_RET_KILL_PROCESS action
seccomp: Introduce SECCOMP_RET_KILL_PROCESS
seccomp: Rename SECCOMP_RET_KILL to SECCOMP_RET_KILL_THREAD
seccomp: Action to log before allowing
seccomp: Filter flag to log all actions except SECCOMP_RET_ALLOW
seccomp: Selftest for detection of filter flag support
seccomp: Sysctl to configure actions that are allowed to be logged
seccomp: Operation for checking if an action is available
seccomp: Sysctl to display available actions
seccomp: Provide matching filter for introspection
selftests/seccomp: Refactor RET_ERRNO tests
selftests/seccomp: Add simple seccomp overhead benchmark
selftests/seccomp: Add tests for basic ptrace actions
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABAgAGBQJZxSV8AAoJELDendYovxMvf5YH/jUEgeFVgP0KRjNsvgp4gu88
4BW7nWYtFt4gGE1KnrKEPbg5Je0OwkpW7vUXxvLwDGWymtHMzVuuB2xxwzkePyzS
17Kzmb/JiuaNpVF4+5v3JvAw3b9iHrZ7T6cXOQgm28agd3m/y+9FSyzoMoNNRdGG
xURwUyK1idRqtkQV5VsQAK0Z1lVF7YhhaxWXBtClsqnKWoeLBLc8fpRJmUNruA33
E2Sdi06mNNN3xudu1s2edC5hAO4EgVKmonnmyHRYonIYwuqSND8fhEXj+PRdHj7s
lLVRQixd3raBiSscLASaQ7I/66frBm+TXzmoHAVtYkdlXBJlisTIvQlMPtAwu60=
=c3HX
-----END PGP SIGNATURE-----
Merge tag 'for-linus-4.14b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"A fix for a missing __init annotation and two cleanup patches"
* tag 'for-linus-4.14b-rc2-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
xen, arm64: drop dummy lookup_address()
xen: don't compile pv-specific parts if XEN_PV isn't configured
xen: x86: mark xen_find_pt_base as __init
In linux-4.13, Wei worked hard to convert dst to a traditional
refcounted model, removing GC.
We now want to make sure a dst refcount can not transition from 0 back
to 1.
The problem here is that input path attached a not refcounted dst to an
skb. Then later, because packet is forwarded and hits skb_dst_force()
before exiting RCU section, we might try to take a refcount on one dst
that is about to be freed, if another cpu saw 1 -> 0 transition in
dst_release() and queued the dst for freeing after one RCU grace period.
Lets unify skb_dst_force() and skb_dst_force_safe(), since we should
always perform the complete check against dst refcount, and not assume
it is not zero.
Bugzilla : https://bugzilla.kernel.org/show_bug.cgi?id=197005
[ 989.919496] skb_dst_force+0x32/0x34
[ 989.919498] __dev_queue_xmit+0x1ad/0x482
[ 989.919501] ? eth_header+0x28/0xc6
[ 989.919502] dev_queue_xmit+0xb/0xd
[ 989.919504] neigh_connected_output+0x9b/0xb4
[ 989.919507] ip_finish_output2+0x234/0x294
[ 989.919509] ? ipt_do_table+0x369/0x388
[ 989.919510] ip_finish_output+0x12c/0x13f
[ 989.919512] ip_output+0x53/0x87
[ 989.919513] ip_forward_finish+0x53/0x5a
[ 989.919515] ip_forward+0x2cb/0x3e6
[ 989.919516] ? pskb_trim_rcsum.part.9+0x4b/0x4b
[ 989.919518] ip_rcv_finish+0x2e2/0x321
[ 989.919519] ip_rcv+0x26f/0x2eb
[ 989.919522] ? vlan_do_receive+0x4f/0x289
[ 989.919523] __netif_receive_skb_core+0x467/0x50b
[ 989.919526] ? tcp_gro_receive+0x239/0x239
[ 989.919529] ? inet_gro_receive+0x226/0x238
[ 989.919530] __netif_receive_skb+0x4d/0x5f
[ 989.919532] netif_receive_skb_internal+0x5c/0xaf
[ 989.919533] napi_gro_receive+0x45/0x81
[ 989.919536] ixgbe_poll+0xc8a/0xf09
[ 989.919539] ? kmem_cache_free_bulk+0x1b6/0x1f7
[ 989.919540] net_rx_action+0xf4/0x266
[ 989.919543] __do_softirq+0xa8/0x19d
[ 989.919545] irq_exit+0x5d/0x6b
[ 989.919546] do_IRQ+0x9c/0xb5
[ 989.919548] common_interrupt+0x93/0x93
[ 989.919548] </IRQ>
Similarly dst_clone() can use dst_hold() helper to have additional
debugging, as a follow up to commit 44ebe79149 ("net: add debug
atomic_inc_not_zero() in dst_hold()")
In net-next we will convert dst atomic_t to refcount_t for peace of
mind.
Fixes: a4c2fd7f78 ("net: remove DST_NOCACHE flag")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Wei Wang <weiwan@google.com>
Reported-by: Paweł Staszewski <pstaszewski@itcare.pl>
Bisected-by: Paweł Staszewski <pstaszewski@itcare.pl>
Acked-by: Wei Wang <weiwan@google.com>
Acked-by: Martin KaFai Lau <kafai@fb.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
Normally, when input device supporting force feedback effects is being
destroyed, we try to "flush" currently playing effects, so that the
physical device does not continue vibrating (or executing other effects).
Unfortunately this does not work well for uinput as flushing of the effects
deadlocks with the destroy action:
- if device is being destroyed because the file descriptor is being closed,
then there is noone to even service FF requests;
- if device is being destroyed because userspace sent UI_DEV_DESTROY,
while theoretically it could be possible to service FF requests,
userspace is unlikely to do so (they'd need to make sure FF handling
happens on a separate thread) even if kernel solves the issue with FF
ioctls deadlocking with UI_DEV_DESTROY ioctl on udev->mutex.
To avoid lockups like the one below, let's install a custom input device
flush handler, and avoid trying to flush force feedback effects when we
destroying the device, and instead rely on uinput to shut off the device
properly.
NMI watchdog: Watchdog detected hard LOCKUP on cpu 3
...
<<EOE>> [<ffffffff817a0307>] _raw_spin_lock_irqsave+0x37/0x40
[<ffffffff810e633d>] complete+0x1d/0x50
[<ffffffffa00ba08c>] uinput_request_done+0x3c/0x40 [uinput]
[<ffffffffa00ba587>] uinput_request_submit.part.7+0x47/0xb0 [uinput]
[<ffffffffa00bb62b>] uinput_dev_erase_effect+0x5b/0x76 [uinput]
[<ffffffff815d91ad>] erase_effect+0xad/0xf0
[<ffffffff815d929d>] flush_effects+0x4d/0x90
[<ffffffff815d4cc0>] input_flush_device+0x40/0x60
[<ffffffff815daf1c>] evdev_cleanup+0xac/0xc0
[<ffffffff815daf5b>] evdev_disconnect+0x2b/0x60
[<ffffffff815d74ac>] __input_unregister_device+0xac/0x150
[<ffffffff815d75f7>] input_unregister_device+0x47/0x70
[<ffffffffa00bac45>] uinput_destroy_device+0xb5/0xc0 [uinput]
[<ffffffffa00bb2de>] uinput_ioctl_handler.isra.9+0x65e/0x740 [uinput]
[<ffffffff811231ab>] ? do_futex+0x12b/0xad0
[<ffffffffa00bb3f8>] uinput_ioctl+0x18/0x20 [uinput]
[<ffffffff81241248>] do_vfs_ioctl+0x298/0x480
[<ffffffff81337553>] ? security_file_ioctl+0x43/0x60
[<ffffffff812414a9>] SyS_ioctl+0x79/0x90
[<ffffffff817a04ee>] entry_SYSCALL_64_fastpath+0x12/0x71
Reported-by: Rodrigo Rivas Costa <rodrigorivascosta@gmail.com>
Reported-by: Clément VUCHENER <clement.vuchener@gmail.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=193741
Signed-off-by: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Commit 3f1ac7a700 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
deprecated the ethtool_cmd::transceiver field, which was fine in
premise, except that the PHY library was actually using it to report the
type of transceiver: internal or external.
Use the first word of the reserved field to put this __u8 transceiver
field back in. It is made read-only, and we don't expect the
ETHTOOL_xLINKSETTINGS API to be doing anything with this anyway, so this
is mostly for the legacy path where we do:
ethtool_get_settings()
-> dev->ethtool_ops->get_link_ksettings()
-> convert_link_ksettings_to_legacy_settings()
to have no information loss compared to the legacy get_settings API.
Fixes: 3f1ac7a700 ("net: ethtool: add new ETHTOOL_xLINKSETTINGS API")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
Signed-off-by: David S. Miller <davem@davemloft.net>
This reverts commit 74def747bc.
The change to the helper function is only correct for the /proc/irq/
readout usage, but breaks the existing x86 usage of that function.
Reported-by: Yanko Kaneti <yaneti@declera.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>