871693 Commits

Author SHA1 Message Date
Linus Torvalds
fe7d2c23d7 platform-drivers-x86 for v5.4-3
Users of Intel P-Unit IPC driver might be surprised by harmless warning.
 Thus, switch to API which doesn't issue a warning at all.
 
 I²C multi-instantiate driver continues to add slave devices even when IRQ
 resource is not found. For devices in the market IRQ resource is mandatory,
 so, fail the ->probe() of the parent driver to avoid slaves being probed.
 
 Avoid compiler warning due to unused variable in Classmate laptop driver.
 
 The following is an automated git shortlog grouped by driver:
 
 classmate-laptop:
  -  remove unused variable
 
 i2c-multi-instantiate:
  -  Fail the probe if no IRQ provided
 
 intel_punit_ipc:
  -  Avoid error message when retrieving IRQ
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEqaflIX74DDDzMJJtb7wzTHR8rCgFAl2oJjgACgkQb7wzTHR8
 rCiqIhAAwS3WO6HJultOU75VE4IvHGyaXItnq7wm/tiJm5kEWUgVLid9+p/nwax2
 nLQjY1OlP50p7rdZWoIOLB9oZfJmOjzOSr7chxQjKVN1Nvwq1y7cBhNgZX/3d1fk
 THZcA8RbXMnVpmUNbuAK7IW5H6mh82J4jD7JT9kI75JQCs9+42uoTJBYbRnE2H6j
 3V6WSV8y5r0mIxumnd7JyEgpSkDF31OG5r1jJvOG9oHxszPxEOmLzOFw2sBQLcLz
 oVJwAOI3ChSA1bOKFveAiDUtZo1Ndvy1c0c1ce0SWrdCecDudKmp8mdvExl/R9OW
 OJOTmTfDuTvdO2TJ9LEJxwOvdox1LNQNQk2hlSktExePR9xptIfOcV7PWqcl33sR
 qfIF3P0XR80Nwy5DVhZYZCoIa2TYrkP4sYBck9mvorB68D4tGxDeohFmupSp/2T8
 InK6OGym7u3VMeAdOyJoOais7uvjWqNgRBZcV9zi5BZmOdMwflNOFrLx7JWQmk4B
 Sw6Av7z0RvKamOaPJJr+wzIfqs+Hc3WTO1haU3kk8BMHnf5jdPQ9v1misF1ZdoKh
 0/jnPSYYT0zAS60k5dXW+LNt77KY+bN5Jai2WnTUnI85jqeeWiiS5cusuIMcvG9M
 n0Qxd20KBB2hgo662/fWe+U/KhmqdDVPmOuOfp/xBBYWJt/lWT0=
 =dfns
 -----END PGP SIGNATURE-----

Merge tag 'platform-drivers-x86-v5.4-3' of git://git.infradead.org/linux-platform-drivers-x86

Pull x86 platform driver fixes from Andy Shevchenko:

 - Users of Intel P-Unit IPC driver might be surprised by harmless
   warning. Thus, switch to API which doesn't issue a warning at all.

 - I²C multi-instantiate driver continues to add slave devices even when
   IRQ resource is not found. For devices in the market IRQ resource is
   mandatory, so, fail the ->probe() of the parent driver to avoid
   slaves being probed.

 - Avoid compiler warning due to unused variable in Classmate laptop
   driver.

* tag 'platform-drivers-x86-v5.4-3' of git://git.infradead.org/linux-platform-drivers-x86:
  platform/x86: i2c-multi-instantiate: Fail the probe if no IRQ provided
  platform/x86: intel_punit_ipc: Avoid error message when retrieving IRQ
  platform/x86: classmate-laptop: remove unused variable
2019-10-17 08:31:03 -07:00
Linus Torvalds
7801158f83 GPIO fixes for the v5.4 series
The fixes pertain to a problem with initializing the Intel GPIO
 irqchips when adding gpiochips, Andy fixed it up elegantly by
 adding a hardware initialization callback to the struct
 gpio_irq_chip so let's use this. Tested and verified on the
 target hardware.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAl2oFDQACgkQQRCzN7AZ
 XXMqcxAAoFftmi61CZL5EH9oZNMD5ebg8ovBizNbP5KmQ6oAxv7RkX7TJpdAysAF
 FpzmKIeR/hSiZ6t046xS2M1zBUHya9sb/X7Nlt7n/ZDbEC/aCvbRKaxADEdR7flQ
 qzB2u/ZZ25yknTuCxu17Yvo1Bg1T1qEXr+36UJhw8tWDAt60UUi893o6qbnixpJ5
 qaQc2IBrSsTStxnT4Tfom1nJ86gvCRgN1lKTCua0rlKVOtSO2MXf7ZlKz0eJBGpa
 CNML7esE+/+4Qe6QM5KY6Rw9V0jO916N7xPoX2lVnF+HWjCc7uNmB7cWHMk55iuZ
 NmkLfrkEgRwWjWTRXinlqIWTeFNWfrIsDjRbS+dCxZKxRZutPnUgNq8nrA51tD8P
 qVdc29gk5myGKYCjc9onbDeJT6dgbf4ck/lYmumxp/EVlHIWFL5gCi1OwV0Nn80C
 5KFyLvRf2dyKynaZSebTbdqnds4LHK3c49psnmye2eMKv5e7jCPeSzKgsYeCqT/i
 SAa0gFhIGUd5e8BxiLwYf6rjCjoa6jXDz6fkCmSGDPU6iycp+wh2V4fSuXn6zBD/
 wAm67n2QwcKsLWxvSTnCpcbSUt0eykPXHunswAflV+HDMOgmTz40+9F9Ymvcu+B/
 H34k89WBDKWPjePDMrFw2su2EeZFo2o6HFJV8JuJzXn7xGHEpA8=
 =tDS5
 -----END PGP SIGNATURE-----

Merge tag 'gpio-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio

Pull GPIO fixes from Linus Walleij:
 "The fixes pertain to a problem with initializing the Intel GPIO
  irqchips when adding gpiochips.

  Andy fixed it up elegantly by adding a hardware initialization
  callback to the struct gpio_irq_chip so let's use this. Tested and
  verified on the target hardware"

* tag 'gpio-v5.4-3' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-gpio:
  gpio: lynxpoint: set default handler to be handle_bad_irq()
  gpio: merrifield: Move hardware initialization to callback
  gpio: lynxpoint: Move hardware initialization to callback
  gpio: intel-mid: Move hardware initialization to callback
  gpiolib: Initialize the hardware with a callback
  gpio: merrifield: Restore use of irq_base
2019-10-17 08:08:20 -07:00
Ben Dooks
bc88f85c6c kthread: make __kthread_queue_delayed_work static
The __kthread_queue_delayed_work is not exported so
make it static, to avoid the following sparse warning:

  kernel/kthread.c:869:6: warning: symbol '__kthread_queue_delayed_work' was not declared. Should it be static?

Signed-off-by: Ben Dooks <ben.dooks@codethink.co.uk>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-16 09:20:58 -07:00
Linus Torvalds
3b1f00aceb virtio: fixes
Some minor bugfixes
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQEcBAABAgAGBQJdpjWIAAoJECgfDbjSjVRpT7AH/0bt4Oi8XR1+d0XT2B/fxPqZ
 JodVrShmq+CmgAYdA1en+DXfZHspCMFVxWT+DHAcVoynWY8VVK4s/0uww1iXqPHG
 dGOneAdo2kkMfP66KgQXn75vEyDyYPl9NY/TfhWZMD9dVOsqNp4BTVKHoXR5LlIx
 j/r74bCpXGIs/3ZRz8IvO2F3uuvrgjNVTZ9mdlAcBtXx4Qe+X6LoR0YREpMAxhVo
 dD4crD6XWm4tneeQ5nzqhA8uzqasbVZY8afuolZR+ZhplHldBNHwZu21aieN3Lfj
 cVD73Ukarna8hw4szsdTTOBlbxPHAqasAcbW1rU464zy1g1NHDdT/NhvavZlvLo=
 =GSKw
 -----END PGP SIGNATURE-----

Merge tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost

Pull virtio fixes from Michael Tsirkin:
 "Some minor bugfixes"

* tag 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mst/vhost:
  vhost/test: stop device before reset
  tools/virtio: xen stub
  tools/virtio: more stubs
2019-10-15 14:50:10 -07:00
Linus Torvalds
8625732e77 SCSI fixes on 20191015
Five changes, two in drivers (qla2xxx, zfcp), one to MAINTAINERS
 (qla2xxx) and two in the core.  The last two are mostly about removing
 incorrect messages from the kernel log: the resid message is
 definitely wrong and the sync cache on protected drive problem is
 arguably wrong.
 
 Signed-off-by: James E.J. Bottomley <jejb@linux.ibm.com>
 -----BEGIN PGP SIGNATURE-----
 
 iJwEABMIAEQWIQTnYEDbdso9F2cI+arnQslM7pishQUCXaYZYCYcamFtZXMuYm90
 dG9tbGV5QGhhbnNlbnBhcnRuZXJzaGlwLmNvbQAKCRDnQslM7pishVuDAP9HBhGv
 dQ3FPA7gZ33rmsb8M1Q1NJ0GJuvFj2muh9CFYwD6AoJtVLivVZR75gojLLMqKpuf
 6EwRTaUZwYAoWeILNuA=
 =+iTy
 -----END PGP SIGNATURE-----

Merge tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi

Pull SCSI fixes from James Bottomley:
 "Five changes, two in drivers (qla2xxx, zfcp), one to MAINTAINERS
  (qla2xxx) and two in the core.

  The last two are mostly about removing incorrect messages from the
  kernel log: the resid message is definitely wrong and the sync cache
  on protected drive problem is arguably wrong"

* tag 'scsi-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi:
  scsi: MAINTAINERS: Update qla2xxx driver
  scsi: zfcp: fix reaction on bit error threshold notification
  scsi: core: save/restore command resid for error handling
  scsi: qla2xxx: Remove WARN_ON_ONCE in qla2x00_status_cont_entry()
  scsi: sd: Ignore a failure to sync cache due to lack of authorization
2019-10-15 12:19:08 -07:00
Linus Torvalds
8e0d0ad206 sparc64: disable fast-GUP due to unexplained oopses
HAVE_FAST_GUP enables the lockless quick page table walker for simple
cases, and is a nice optimization for some random loads that can then
use get_user_pages_fast() rather than the more careful page walker.

However, for some unexplained reason, it seems to be subtly broken on
sparc64.  The breakage is only with some compiler versions and some
hardware, and nobody seems to have figured out what triggers it,
although there's a simple reprodicer for the problem when it does
trigger.

The problem was introduced with the conversion to the generic GUP code
in commit 7b9afb86b632 ("sparc64: use the generic get_user_pages_fast
code"), but nothing looks obviously wrong in that conversion.  It may be
a compiler bug that just hits us with the code reorganization.  Or it
may be something very specific to sparc64.

This disables HAVE_FAST_GUP entirely.  That makes things like futexes a
bit slower, but at least they work.  If we can figure out the trigger,
that would be lovely, but it's been three months already..

Link: https://lore.kernel.org/lkml/20190717215956.GA30369@altlinux.org/
Fixes: 7b9afb86b632 ("sparc64: use the generic get_user_pages_fast code")
Reported-by: Dmitry V Levin <ldv@altlinux.org>
Reported-by: Anatoly Pugachev <matorola@gmail.com>
Requested-by: Meelis Roos <mroos@linux.ee>
Suggested-by: Christoph Hellwig <hch@infradead.org>
Cc: David Miller <davem@davemloft.net>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-15 09:56:36 -07:00
Linus Torvalds
02755af0f3 Merge branch 'parisc-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux
Pull parisc fixes from Helge Deller:

 - Fix a parisc-specific fallout of Christoph's
   dma_set_mask_and_coherent() patches (Sven)

 - Fix a vmap memory leak in ioremap()/ioremap() (Helge)

 - Some minor cleanups and documentation updates (Nick, Helge)

* 'parisc-5.4-2' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
  parisc: Remove 32-bit DMA enforcement from sba_iommu
  parisc: Fix vmap memory leak in ioremap()/iounmap()
  parisc: prefer __section from compiler_attributes.h
  parisc: sysctl.c: Use CONFIG_PARISC instead of __hppa_ define
  MAINTAINERS: Add hp_sdc drivers to parisc arch
2019-10-15 09:37:01 -07:00
Linus Torvalds
37b238da15 Merge branch 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging
Pull dmi fix from Jean Delvare.

* 'dmi-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jdelvare/staging:
  firmware: dmi: Fix unlikely out-of-bounds read in save_mem_devices
2019-10-15 09:20:07 -07:00
Linus Torvalds
5bc52f64e8 Merge branch 'akpm' (patches from Andrew)
Merge more fixes from Andrew Morton:
 "The usual shower of hotfixes and some followups to the recently merged
  page_owner enhancements"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
  mm/memory-failure: poison read receives SIGKILL instead of SIGBUS if mmaped more than once
  mm/slab.c: fix kernel-doc warning for __ksize()
  xarray.h: fix kernel-doc warning
  bitmap.h: fix kernel-doc warning and typo
  fs/fs-writeback.c: fix kernel-doc warning
  fs/libfs.c: fix kernel-doc warning
  fs/direct-io.c: fix kernel-doc warning
  mm, compaction: fix wrong pfn handling in __reset_isolation_pfn()
  mm, hugetlb: allow hugepage allocations to reclaim as needed
  lib/test_meminit: add a kmem_cache_alloc_bulk() test
  mm/slub.c: init_on_free=1 should wipe freelist ptr for bulk allocations
  lib/generic-radix-tree.c: add kmemleak annotations
  mm/slub: fix a deadlock in show_slab_objects()
  mm, page_owner: rename flag indicating that page is allocated
  mm, page_owner: decouple freeing stack trace from debug_pagealloc
  mm, page_owner: fix off-by-one error in __set_page_owner_handle()
2019-10-14 16:49:59 -07:00
Andy Shevchenko
75e99bf5ed gpio: lynxpoint: set default handler to be handle_bad_irq()
We switch the default handler to be handle_bad_irq() instead of
handle_simple_irq() (which was not correct anyway).

Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-10-15 01:19:05 +02:00
Andy Shevchenko
4c87540940 gpio: merrifield: Move hardware initialization to callback
The driver wants to initialize related registers before IRQ chip will be added.
That's why move it to a corresponding callback. It also fixes the NULL pointer
dereference.

Fixes: 8f86a5b4ad67 ("gpio: merrifield: Pass irqchip when adding gpiochip")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-10-15 01:19:01 +02:00
Andy Shevchenko
a339120616 gpio: lynxpoint: Move hardware initialization to callback
The driver wants to initialize related registers before IRQ chip will be added.
That's why move it to a corresponding callback. It also fixes the NULL pointer
dereference.

Fixes: 7b1e889436a1 ("gpio: lynxpoint: Pass irqchip when adding gpiochip")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-10-15 01:18:57 +02:00
Andy Shevchenko
a752fbb4b4 gpio: intel-mid: Move hardware initialization to callback
The driver wants to initialize related registers before IRQ chip will be added.
That's why move it to a corresponding callback. It also fixes the NULL pointer
dereference.

Fixes: 8069e69a9792 ("gpio: intel-mid: Pass irqchip when adding gpiochip")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-10-15 01:18:51 +02:00
Andy Shevchenko
9411e3aaa6 gpiolib: Initialize the hardware with a callback
After changing the drivers to use GPIO core to add an IRQ chip
it appears that some of them requires a hardware initialization
before adding the IRQ chip.

Add an optional callback ->init_hw() to allow that drivers
to initialize hardware if needed.

This change is a part of the fix NULL pointer dereference
brought to the several drivers recently.

Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-10-15 01:18:46 +02:00
Andy Shevchenko
6658f87f21 gpio: merrifield: Restore use of irq_base
During conversion to internal IRQ chip initialization the commit
  8f86a5b4ad67 ("gpio: merrifield: Pass irqchip when adding gpiochip")
lost the irq_base assignment.

drivers/gpio/gpio-merrifield.c: In function ‘mrfld_gpio_probe’:
drivers/gpio/gpio-merrifield.c:405:17: warning: variable ‘irq_base’ set but not used [-Wunused-but-set-variable]

Assign the girq->first to it.

Fixes: 8f86a5b4ad67 ("gpio: merrifield: Pass irqchip when adding gpiochip")
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
2019-10-15 01:18:15 +02:00
Jane Chu
3d7fed4ad8 mm/memory-failure: poison read receives SIGKILL instead of SIGBUS if mmaped more than once
Mmap /dev/dax more than once, then read the poison location using
address from one of the mappings.  The other mappings due to not having
the page mapped in will cause SIGKILLs delivered to the process.
SIGKILL succeeds over SIGBUS, so user process loses the opportunity to
handle the UE.

Although one may add MAP_POPULATE to mmap(2) to work around the issue,
MAP_POPULATE makes mapping 128GB of pmem several magnitudes slower, so
isn't always an option.

Details -

  ndctl inject-error --block=10 --count=1 namespace6.0

  ./read_poison -x dax6.0 -o 5120 -m 2
  mmaped address 0x7f5bb6600000
  mmaped address 0x7f3cf3600000
  doing local read at address 0x7f3cf3601400
  Killed

Console messages in instrumented kernel -

  mce: Uncorrected hardware memory error in user-access at edbe201400
  Memory failure: tk->addr = 7f5bb6601000
  Memory failure: address edbe201: call dev_pagemap_mapping_shift
  dev_pagemap_mapping_shift: page edbe201: no PUD
  Memory failure: tk->size_shift == 0
  Memory failure: Unable to find user space address edbe201 in read_poison
  Memory failure: tk->addr = 7f3cf3601000
  Memory failure: address edbe201: call dev_pagemap_mapping_shift
  Memory failure: tk->size_shift = 21
  Memory failure: 0xedbe201: forcibly killing read_poison:22434 because of failure to unmap corrupted page
    => to deliver SIGKILL
  Memory failure: 0xedbe201: Killing read_poison:22434 due to hardware memory corruption
    => to deliver SIGBUS

Link: http://lkml.kernel.org/r/1565112345-28754-3-git-send-email-jane.chu@oracle.com
Signed-off-by: Jane Chu <jane.chu@oracle.com>
Suggested-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Randy Dunlap
87bf4f71af mm/slab.c: fix kernel-doc warning for __ksize()
Fix kernel-doc warning in mm/slab.c:

  mm/slab.c:4215: warning: Function parameter or member 'objp' not described in '__ksize'

Also add Return: documentation section for this function.

Link: http://lkml.kernel.org/r/68c9fd7d-f09e-d376-e292-c7b2bdf1774d@infradead.org
Fixes: 10d1f8cb3965 ("mm/slab: refactor common ksize KASAN logic into slab_common.c")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Acked-by: Marco Elver <elver@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Randy Dunlap
13bea898cd xarray.h: fix kernel-doc warning
Fix (Sphinx) kernel-doc warning in <linux/xarray.h>:

  include/linux/xarray.h:232: WARNING: Unexpected indentation.

Link: http://lkml.kernel.org/r/89ba2134-ce23-7c10-5ee1-ef83b35aa984@infradead.org
Fixes: a3e4d3f97ec8 ("XArray: Redesign xa_alloc API")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Matthew Wilcox <willy@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Randy Dunlap
2a7e582f42 bitmap.h: fix kernel-doc warning and typo
Fix kernel-doc warning in <linux/bitmap.h>:

  include/linux/bitmap.h:341: warning: Function parameter or member 'nbits' not described in 'bitmap_or_equal'

Also fix small typo (bitnaps).

Link: http://lkml.kernel.org/r/0729ea7a-2c0d-b2c5-7dd3-3629ee0803e2@infradead.org
Fixes: b9fa6442f704 ("cpumask: Implement cpumask_or_equal()")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Randy Dunlap
b46ec1da5e fs/fs-writeback.c: fix kernel-doc warning
Fix kernel-doc warning in fs/fs-writeback.c:

  fs/fs-writeback.c:913: warning: Excess function parameter 'nr_pages' description in 'cgroup_writeback_by_id'

Link: http://lkml.kernel.org/r/756645ac-0ce8-d47e-d30a-04d9e4923a4f@infradead.org
Fixes: d62241c7a406 ("writeback, memcg: Implement cgroup_writeback_by_id()")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Tejun Heo <tj@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Randy Dunlap
8e88bfba77 fs/libfs.c: fix kernel-doc warning
Fix kernel-doc warning in fs/libfs.c:

  fs/libfs.c:496: warning: Excess function parameter 'available' description in 'simple_write_end'

Link: http://lkml.kernel.org/r/5fc9d70b-e377-0ec9-066a-970d49579041@infradead.org
Fixes: ad2a722f196d ("libfs: Open code simple_commit_write into only user")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Boaz Harrosh <boazh@netapp.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Randy Dunlap
c70d868f27 fs/direct-io.c: fix kernel-doc warning
Fix kernel-doc warning in fs/direct-io.c:

  fs/direct-io.c:258: warning: Excess function parameter 'offset' description in 'dio_complete'

Also, don't mark this function as having kernel-doc notation since it is
not exported.

Link: http://lkml.kernel.org/r/97908511-4328-4a56-17fe-f43a1d7aa470@infradead.org
Fixes: 6d544bb4d901 ("dio: centralize completion in dio_complete()")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Zach Brown <zab@zabbo.net>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Vlastimil Babka
a2e9a5afce mm, compaction: fix wrong pfn handling in __reset_isolation_pfn()
Florian and Dave reported [1] a NULL pointer dereference in
__reset_isolation_pfn().  While the exact cause is unclear, staring at
the code revealed two bugs, which might be related.

One bug is that if zone starts in the middle of pageblock, block_page
might correspond to different pfn than block_pfn, and then the
pfn_valid_within() checks will check different pfn's than those accessed
via struct page.  This might result in acessing an unitialized page in
CONFIG_HOLES_IN_ZONE configs.

The other bug is that end_page refers to the first page of next
pageblock and not last page of current pageblock.  The online and valid
check is then wrong and with sections, the while (page < end_page) loop
might wander off actual struct page arrays.

[1] https://lore.kernel.org/linux-xfs/87o8z1fvqu.fsf@mid.deneb.enyo.de/

Link: http://lkml.kernel.org/r/20191008152915.24704-1-vbabka@suse.cz
Fixes: 6b0868c820ff ("mm/compaction.c: correct zone boundary handling when resetting pageblock skip hints")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Florian Weimer <fw@deneb.enyo.de>
Reported-by: Dave Chinner <david@fromorbit.com>
Acked-by: Mel Gorman <mgorman@techsingularity.net>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
David Rientjes
3f36d86694 mm, hugetlb: allow hugepage allocations to reclaim as needed
Commit b39d0ee2632d ("mm, page_alloc: avoid expensive reclaim when
compaction may not succeed") has chnaged the allocator to bail out from
the allocator early to prevent from a potentially excessive memory
reclaim.  __GFP_RETRY_MAYFAIL is designed to retry the allocation,
reclaim and compaction loop as long as there is a reasonable chance to
make forward progress.  Neither COMPACT_SKIPPED nor COMPACT_DEFERRED at
the INIT_COMPACT_PRIORITY compaction attempt gives this feedback.

The most obvious affected subsystem is hugetlbfs which allocates huge
pages based on an admin request (or via admin configured overcommit).  I
have done a simple test which tries to allocate half of the memory for
hugetlb pages while the memory is full of a clean page cache.  This is
not an unusual situation because we try to cache as much of the memory
as possible and sysctl/sysfs interface to allocate huge pages is there
for flexibility to allocate hugetlb pages at any time.

System has 1GB of RAM and we are requesting 515MB worth of hugetlb pages
after the memory is prefilled by a clean page cache:

  root@test1:~# cat hugetlb_test.sh

  set -x
  echo 0 > /proc/sys/vm/nr_hugepages
  echo 3 > /proc/sys/vm/drop_caches
  echo 1 > /proc/sys/vm/compact_memory
  dd if=/mnt/data/file-1G of=/dev/null bs=$((4<<10))
  TS=$(date +%s)
  echo 256 > /proc/sys/vm/nr_hugepages
  cat /proc/sys/vm/nr_hugepages

The results for 2 consecutive runs on clean 5.3

  root@test1:~# sh hugetlb_test.sh
  + echo 0
  + echo 3
  + echo 1
  + dd if=/mnt/data/file-1G of=/dev/null bs=4096
  262144+0 records in
  262144+0 records out
  1073741824 bytes (1.1 GB) copied, 21.0694 s, 51.0 MB/s
  + date +%s
  + TS=1569905284
  + echo 256
  + cat /proc/sys/vm/nr_hugepages
  256
  root@test1:~# sh hugetlb_test.sh
  + echo 0
  + echo 3
  + echo 1
  + dd if=/mnt/data/file-1G of=/dev/null bs=4096
  262144+0 records in
  262144+0 records out
  1073741824 bytes (1.1 GB) copied, 21.7548 s, 49.4 MB/s
  + date +%s
  + TS=1569905311
  + echo 256
  + cat /proc/sys/vm/nr_hugepages
  256

Now with b39d0ee2632d applied

  root@test1:~# sh hugetlb_test.sh
  + echo 0
  + echo 3
  + echo 1
  + dd if=/mnt/data/file-1G of=/dev/null bs=4096
  262144+0 records in
  262144+0 records out
  1073741824 bytes (1.1 GB) copied, 20.1815 s, 53.2 MB/s
  + date +%s
  + TS=1569905516
  + echo 256
  + cat /proc/sys/vm/nr_hugepages
  11
  root@test1:~# sh hugetlb_test.sh
  + echo 0
  + echo 3
  + echo 1
  + dd if=/mnt/data/file-1G of=/dev/null bs=4096
  262144+0 records in
  262144+0 records out
  1073741824 bytes (1.1 GB) copied, 21.9485 s, 48.9 MB/s
  + date +%s
  + TS=1569905541
  + echo 256
  + cat /proc/sys/vm/nr_hugepages
  12

The success rate went down by factor of 20!

Although hugetlb allocation requests might fail and it is reasonable to
expect them to under extremely fragmented memory or when the memory is
under a heavy pressure but the above situation is not that case.

Fix the regression by reverting back to the previous behavior for
__GFP_RETRY_MAYFAIL requests and disable the beail out heuristic for
those requests.

Mike said:

: hugetlbfs allocations are commonly done via sysctl/sysfs shortly after
: boot where this may not be as much of an issue.  However, I am aware of at
: least three use cases where allocations are made after the system has been
: up and running for quite some time:
:
: - DB reconfiguration.  If sysctl/sysfs fails to get required number of
:   huge pages, system is rebooted to perform allocation after boot.
:
: - VM provisioning.  If unable get required number of huge pages, fall
:   back to base pages.
:
: - An application that does not preallocate pool, but rather allocates
:   pages at fault time for optimal NUMA locality.
:
: In all cases, I would expect b39d0ee2632d to cause regressions and
: noticable behavior changes.
:
: My quick/limited testing in
: https://lkml.kernel.org/r/3468b605-a3a9-6978-9699-57c52a90bd7e@oracle.com
: was insufficient.  It was also mentioned that if something like
: b39d0ee2632d went forward, I would like exemptions for __GFP_RETRY_MAYFAIL
: requests as in this patch.

[mhocko@suse.com: reworded changelog]
Link: http://lkml.kernel.org/r/20191007075548.12456-1-mhocko@kernel.org
Fixes: b39d0ee2632d ("mm, page_alloc: avoid expensive reclaim when compaction may not succeed")
Signed-off-by: David Rientjes <rientjes@google.com>
Signed-off-by: Michal Hocko <mhocko@suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Mel Gorman <mgorman@suse.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Alexander Potapenko
03a9349ac0 lib/test_meminit: add a kmem_cache_alloc_bulk() test
Make sure allocations from kmem_cache_alloc_bulk() and
kmem_cache_free_bulk() are properly initialized.

Link: http://lkml.kernel.org/r/20191007091605.30530-2-glider@google.com
Signed-off-by: Alexander Potapenko <glider@google.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Thibaut Sautereau <thibaut@sautereau.fr>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Alexander Potapenko
0f181f9fbe mm/slub.c: init_on_free=1 should wipe freelist ptr for bulk allocations
slab_alloc_node() already zeroed out the freelist pointer if
init_on_free was on.  Thibaut Sautereau noticed that the same needs to
be done for kmem_cache_alloc_bulk(), which performs the allocations
separately.

kmem_cache_alloc_bulk() is currently used in two places in the kernel,
so this change is unlikely to have a major performance impact.

SLAB doesn't require a similar change, as auto-initialization makes the
allocator store the freelist pointers off-slab.

Link: http://lkml.kernel.org/r/20191007091605.30530-1-glider@google.com
Fixes: 6471384af2a6 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Signed-off-by: Alexander Potapenko <glider@google.com>
Reported-by: Thibaut Sautereau <thibaut@sautereau.fr>
Reported-by: Kees Cook <keescook@chromium.org>
Cc: Christoph Lameter <cl@linux.com>
Cc: Laura Abbott <labbott@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:01 -07:00
Eric Biggers
3c52b0af05 lib/generic-radix-tree.c: add kmemleak annotations
Kmemleak is falsely reporting a leak of the slab allocation in
sctp_stream_init_ext():

  BUG: memory leak
  unreferenced object 0xffff8881114f5d80 (size 96):
   comm "syz-executor934", pid 7160, jiffies 4294993058 (age 31.950s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<00000000ce7a1326>] kmemleak_alloc_recursive  include/linux/kmemleak.h:55 [inline]
     [<00000000ce7a1326>] slab_post_alloc_hook mm/slab.h:439 [inline]
     [<00000000ce7a1326>] slab_alloc mm/slab.c:3326 [inline]
     [<00000000ce7a1326>] kmem_cache_alloc_trace+0x13d/0x280 mm/slab.c:3553
     [<000000007abb7ac9>] kmalloc include/linux/slab.h:547 [inline]
     [<000000007abb7ac9>] kzalloc include/linux/slab.h:742 [inline]
     [<000000007abb7ac9>] sctp_stream_init_ext+0x2b/0xa0  net/sctp/stream.c:157
     [<0000000048ecb9c1>] sctp_sendmsg_to_asoc+0x946/0xa00  net/sctp/socket.c:1882
     [<000000004483ca2b>] sctp_sendmsg+0x2a8/0x990 net/sctp/socket.c:2102
     [...]

But it's freed later.  Kmemleak misses the allocation because its
pointer is stored in the generic radix tree sctp_stream::out, and the
generic radix tree uses raw pages which aren't tracked by kmemleak.

Fix this by adding the kmemleak hooks to the generic radix tree code.

Link: http://lkml.kernel.org/r/20191004065039.727564-1-ebiggers@kernel.org
Signed-off-by: Eric Biggers <ebiggers@google.com>
Reported-by: <syzbot+7f3b6b106be8dcdcdeec@syzkaller.appspotmail.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner@gmail.com>
Acked-by: Neil Horman <nhorman@tuxdriver.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Cc: Kent Overstreet <kent.overstreet@gmail.com>
Cc: Vlad Yasevich <vyasevich@gmail.com>
Cc: Xin Long <lucien.xin@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:00 -07:00
Qian Cai
e4f8e513c3 mm/slub: fix a deadlock in show_slab_objects()
A long time ago we fixed a similar deadlock in show_slab_objects() [1].
However, it is apparently due to the commits like 01fb58bcba63 ("slab:
remove synchronous synchronize_sched() from memcg cache deactivation
path") and 03afc0e25f7f ("slab: get_online_mems for
kmem_cache_{create,destroy,shrink}"), this kind of deadlock is back by
just reading files in /sys/kernel/slab which will generate a lockdep
splat below.

Since the "mem_hotplug_lock" here is only to obtain a stable online node
mask while racing with NUMA node hotplug, in the worst case, the results
may me miscalculated while doing NUMA node hotplug, but they shall be
corrected by later reads of the same files.

  WARNING: possible circular locking dependency detected
  ------------------------------------------------------
  cat/5224 is trying to acquire lock:
  ffff900012ac3120 (mem_hotplug_lock.rw_sem){++++}, at:
  show_slab_objects+0x94/0x3a8

  but task is already holding lock:
  b8ff009693eee398 (kn->count#45){++++}, at: kernfs_seq_start+0x44/0xf0

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #2 (kn->count#45){++++}:
         lock_acquire+0x31c/0x360
         __kernfs_remove+0x290/0x490
         kernfs_remove+0x30/0x44
         sysfs_remove_dir+0x70/0x88
         kobject_del+0x50/0xb0
         sysfs_slab_unlink+0x2c/0x38
         shutdown_cache+0xa0/0xf0
         kmemcg_cache_shutdown_fn+0x1c/0x34
         kmemcg_workfn+0x44/0x64
         process_one_work+0x4f4/0x950
         worker_thread+0x390/0x4bc
         kthread+0x1cc/0x1e8
         ret_from_fork+0x10/0x18

  -> #1 (slab_mutex){+.+.}:
         lock_acquire+0x31c/0x360
         __mutex_lock_common+0x16c/0xf78
         mutex_lock_nested+0x40/0x50
         memcg_create_kmem_cache+0x38/0x16c
         memcg_kmem_cache_create_func+0x3c/0x70
         process_one_work+0x4f4/0x950
         worker_thread+0x390/0x4bc
         kthread+0x1cc/0x1e8
         ret_from_fork+0x10/0x18

  -> #0 (mem_hotplug_lock.rw_sem){++++}:
         validate_chain+0xd10/0x2bcc
         __lock_acquire+0x7f4/0xb8c
         lock_acquire+0x31c/0x360
         get_online_mems+0x54/0x150
         show_slab_objects+0x94/0x3a8
         total_objects_show+0x28/0x34
         slab_attr_show+0x38/0x54
         sysfs_kf_seq_show+0x198/0x2d4
         kernfs_seq_show+0xa4/0xcc
         seq_read+0x30c/0x8a8
         kernfs_fop_read+0xa8/0x314
         __vfs_read+0x88/0x20c
         vfs_read+0xd8/0x10c
         ksys_read+0xb0/0x120
         __arm64_sys_read+0x54/0x88
         el0_svc_handler+0x170/0x240
         el0_svc+0x8/0xc

  other info that might help us debug this:

  Chain exists of:
    mem_hotplug_lock.rw_sem --> slab_mutex --> kn->count#45

   Possible unsafe locking scenario:

         CPU0                    CPU1
         ----                    ----
    lock(kn->count#45);
                                 lock(slab_mutex);
                                 lock(kn->count#45);
    lock(mem_hotplug_lock.rw_sem);

   *** DEADLOCK ***

  3 locks held by cat/5224:
   #0: 9eff00095b14b2a0 (&p->lock){+.+.}, at: seq_read+0x4c/0x8a8
   #1: 0eff008997041480 (&of->mutex){+.+.}, at: kernfs_seq_start+0x34/0xf0
   #2: b8ff009693eee398 (kn->count#45){++++}, at:
  kernfs_seq_start+0x44/0xf0

  stack backtrace:
  Call trace:
   dump_backtrace+0x0/0x248
   show_stack+0x20/0x2c
   dump_stack+0xd0/0x140
   print_circular_bug+0x368/0x380
   check_noncircular+0x248/0x250
   validate_chain+0xd10/0x2bcc
   __lock_acquire+0x7f4/0xb8c
   lock_acquire+0x31c/0x360
   get_online_mems+0x54/0x150
   show_slab_objects+0x94/0x3a8
   total_objects_show+0x28/0x34
   slab_attr_show+0x38/0x54
   sysfs_kf_seq_show+0x198/0x2d4
   kernfs_seq_show+0xa4/0xcc
   seq_read+0x30c/0x8a8
   kernfs_fop_read+0xa8/0x314
   __vfs_read+0x88/0x20c
   vfs_read+0xd8/0x10c
   ksys_read+0xb0/0x120
   __arm64_sys_read+0x54/0x88
   el0_svc_handler+0x170/0x240
   el0_svc+0x8/0xc

I think it is important to mention that this doesn't expose the
show_slab_objects to use-after-free.  There is only a single path that
might really race here and that is the slab hotplug notifier callback
__kmem_cache_shrink (via slab_mem_going_offline_callback) but that path
doesn't really destroy kmem_cache_node data structures.

[1] http://lkml.iu.edu/hypermail/linux/kernel/1101.0/02850.html

[akpm@linux-foundation.org: add comment explaining why we don't need mem_hotplug_lock]
Link: http://lkml.kernel.org/r/1570192309-10132-1-git-send-email-cai@lca.pw
Fixes: 01fb58bcba63 ("slab: remove synchronous synchronize_sched() from memcg cache deactivation path")
Fixes: 03afc0e25f7f ("slab: get_online_mems for kmem_cache_{create,destroy,shrink}")
Signed-off-by: Qian Cai <cai@lca.pw>
Acked-by: Michal Hocko <mhocko@suse.com>
Cc: Christoph Lameter <cl@linux.com>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Tejun Heo <tj@kernel.org>
Cc: Vladimir Davydov <vdavydov.dev@gmail.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:00 -07:00
Vlastimil Babka
fdf3bf8091 mm, page_owner: rename flag indicating that page is allocated
Commit 37389167a281 ("mm, page_owner: keep owner info when freeing the
page") has introduced a flag PAGE_EXT_OWNER_ACTIVE to indicate that page
is tracked as being allocated.  Kirril suggested naming it
PAGE_EXT_OWNER_ALLOCATED to make it more clear, as "active is somewhat
loaded term for a page".

Link: http://lkml.kernel.org/r/20190930122916.14969-4-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Suggested-by: Kirill A. Shutemov <kirill@shutemov.name>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:00 -07:00
Vlastimil Babka
0fe9a448a0 mm, page_owner: decouple freeing stack trace from debug_pagealloc
Commit 8974558f49a6 ("mm, page_owner, debug_pagealloc: save and dump
freeing stack trace") enhanced page_owner to also store freeing stack
trace, when debug_pagealloc is also enabled.  KASAN would also like to
do this [1] to improve error reports to debug e.g. UAF issues.

Kirill has suggested that the freeing stack trace saving should be also
possible to be enabled separately from KASAN or debug_pagealloc, i.e.
with an extra boot option.  Qian argued that we have enough options
already, and avoiding the extra overhead is not worth the complications
in the case of a debugging option.  Kirill noted that the extra stack
handle in struct page_owner requires 0.1% of memory.

This patch therefore enables free stack saving whenever page_owner is
enabled, regardless of whether debug_pagealloc or KASAN is also enabled.
KASAN kernels booted with page_owner=on will thus benefit from the
improved error reports.

[1] https://bugzilla.kernel.org/show_bug.cgi?id=203967

[vbabka@suse.cz: v3]
  Link: http://lkml.kernel.org/r/20191007091808.7096-3-vbabka@suse.cz
Link: http://lkml.kernel.org/r/20190930122916.14969-3-vbabka@suse.cz
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reviewed-by: Qian Cai <cai@lca.pw>
Suggested-by: Dmitry Vyukov <dvyukov@google.com>
Suggested-by: Walter Wu <walter-zh.wu@mediatek.com>
Suggested-by: Andrey Ryabinin <aryabinin@virtuozzo.com>
Suggested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Suggested-by: Qian Cai <cai@lca.pw>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:00 -07:00
Vlastimil Babka
5556cfe8d9 mm, page_owner: fix off-by-one error in __set_page_owner_handle()
Patch series "followups to debug_pagealloc improvements through
page_owner", v3.

These are followups to [1] which made it to Linus meanwhile.  Patches 1
and 3 are based on Kirill's review, patch 2 on KASAN request [2].  It
would be nice if all of this made it to 5.4 with [1] already there (or
at least Patch 1).

This patch (of 3):

As noted by Kirill, commit 7e2f2a0cd17c ("mm, page_owner: record page
owner for each subpage") has introduced an off-by-one error in
__set_page_owner_handle() when looking up page_ext for subpages.  As a
result, the head page page_owner info is set twice, while for the last
tail page, it's not set at all.

Fix this and also make the code more efficient by advancing the page_ext
pointer we already have, instead of calling lookup_page_ext() for each
subpage.  Since the full size of struct page_ext is not known at compile
time, we can't use a simple page_ext++ statement, so introduce a
page_ext_next() inline function for that.

Link: http://lkml.kernel.org/r/20190930122916.14969-2-vbabka@suse.cz
Fixes: 7e2f2a0cd17c ("mm, page_owner: record page owner for each subpage")
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Reported-by: Kirill A. Shutemov <kirill@shutemov.name>
Reported-by: Miles Chen <miles.chen@mediatek.com>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Cc: Walter Wu <walter-zh.wu@mediatek.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 15:04:00 -07:00
Sven Schnelle
c32c47aa36 parisc: Remove 32-bit DMA enforcement from sba_iommu
This breaks booting from sata_sil24 with the recent DMA change.
According to James Bottomley this was in to improve performance by
kicking the device into 32 bit descriptors, which are usually more
efficient, especially with older dual descriptor format cards like we
have on parisc systems.
Remove it for now to make DMA working again.

Fixes: dcc02c19cc06 ("sata_sil24: use dma_set_mask_and_coherent")
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Signed-off-by: Helge Deller <deller@gmx.de>
2019-10-14 21:44:26 +02:00
Helge Deller
513f7f747e parisc: Fix vmap memory leak in ioremap()/iounmap()
Sven noticed that calling ioremap() and iounmap() multiple times leads
to a vmap memory leak:
	vmap allocation for size 4198400 failed:
	use vmalloc=<size> to increase size

It seems we missed calling vunmap() in iounmap().

Signed-off-by: Helge Deller <deller@gmx.de>
Noticed-by: Sven Schnelle <svens@stackframe.org>
Cc: <stable@vger.kernel.org> # v3.16+
2019-10-14 21:44:17 +02:00
Nick Desaulniers
0703ad217e parisc: prefer __section from compiler_attributes.h
Reported-by: Sedat Dilek <sedat.dilek@gmail.com>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Nick Desaulniers <ndesaulniers@google.com>
Signed-off-by: Helge Deller <deller@gmx.de>
2019-10-14 21:44:07 +02:00
Helge Deller
b67114db64 parisc: sysctl.c: Use CONFIG_PARISC instead of __hppa_ define
Signed-off-by: Helge Deller <deller@gmx.de>
2019-10-14 21:43:54 +02:00
Jean Delvare
81dde26de9 firmware: dmi: Fix unlikely out-of-bounds read in save_mem_devices
Before reading the Extended Size field, we should ensure it fits in
the DMI record. There is already a record length check but it does
not cover that field.

It would take a seriously corrupted DMI table to hit that bug, so no
need to worry, but we should still fix it.

Signed-off-by: Jean Delvare <jdelvare@suse.de>
Fixes: 6deae96b42eb ("firmware, DMI: Add function to look up a handle and return DIMM size")
Cc: Tony Luck <tony.luck@intel.com>
Cc: Borislav Petkov <bp@suse.de>
2019-10-14 21:41:24 +02:00
Catalin Marinas
2abd839aa7 kmemleak: Do not corrupt the object_list during clean-up
In case of an error (e.g. memory pool too small), kmemleak disables
itself and cleans up the already allocated metadata objects. However, if
this happens early before the RCU callback mechanism is available,
put_object() skips call_rcu() and frees the object directly. This is not
safe with the RCU list traversal in __kmemleak_do_cleanup().

Change the list traversal in __kmemleak_do_cleanup() to
list_for_each_entry_safe() and remove the rcu_read_{lock,unlock} since
the kmemleak is already disabled at this point. In addition, avoid an
unnecessary metadata object rb-tree look-up since it already has the
struct kmemleak_object pointer.

Fixes: c5665868183f ("mm: kmemleak: use the memory pool for early allocations")
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reported-by: Marc Dionne <marc.c.dionne@gmail.com>
Reported-by: Ted Ts'o <tytso@mit.edu>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2019-10-14 08:56:16 -07:00
Andy Shevchenko
832392db97 platform/x86: i2c-multi-instantiate: Fail the probe if no IRQ provided
For APIC case of interrupt we don't fail a ->probe() of the driver,
which makes kernel to print a lot of warnings from the children.

We have two options here:
- switch to platform_get_irq_optional(), though it won't stop children
  to be probed and failed
- fail the ->probe() of i2c-multi-instantiate

Since the in reality we never had devices in the wild where IRQ resource
is optional, the latter solution suits the best.

Fixes: 799d3379a672 ("platform/x86: i2c-multi-instantiate: Introduce IOAPIC IRQ support")
Reported-by: Ammy Yi <ammy.yi@intel.com>
Cc: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Cc: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
2019-10-14 15:31:50 +03:00
Linus Torvalds
4f5cafb5cb Linux 5.4-rc3 v5.4-rc3 2019-10-13 16:37:36 -07:00
Linus Torvalds
d4615e5a46 A few tracing fixes:
- Removed locked down from tracefs itself and moved it to the trace
    directory. Having the open functions there do the lockdown checks.
 
  - Fixed a few races with opening an instance file and the instance being
    deleted (Discovered during the locked down updates). Kept separate
    from the clean up code such that they can be backported to stable
    easier.
 
  - Cleaned up and consolidated the checks done when opening a trace
    file, as there were multiple checks that need to be done, and it
    did not make sense having them done in each open instance.
 
  - Fixed a regression in the record mcount code.
 
  - Small hw_lat detector tracer fixes.
 
  - A trace_pipe read fix due to not initializing trace_seq.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCXaNhphQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6quDIAP4v08ARNdIh+r+c4AOBm3xsOuE/d9GB
 I56ydnskm+x2JQD6Ap9ivXe9yDBIErFeHNtCoq7pM8YDI4YoYIB30N0GfwM=
 =7oAu
 -----END PGP SIGNATURE-----

Merge tag 'trace-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing fixes from Steven Rostedt:
 "A few tracing fixes:

   - Remove lockdown from tracefs itself and moved it to the trace
     directory. Have the open functions there do the lockdown checks.

   - Fix a few races with opening an instance file and the instance
     being deleted (Discovered during the lockdown updates). Kept
     separate from the clean up code such that they can be backported to
     stable easier.

   - Clean up and consolidated the checks done when opening a trace
     file, as there were multiple checks that need to be done, and it
     did not make sense having them done in each open instance.

   - Fix a regression in the record mcount code.

   - Small hw_lat detector tracer fixes.

   - A trace_pipe read fix due to not initializing trace_seq"

* tag 'trace-v5.4-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  tracing: Initialize iter->seq after zeroing in tracing_read_pipe()
  tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
  tracing/hwlat: Report total time spent in all NMIs during the sample
  recordmcount: Fix nop_mcount() function
  tracing: Do not create tracefs files if tracefs lockdown is in effect
  tracing: Add locked_down checks to the open calls of files created for tracefs
  tracing: Add tracing_check_open_get_tr()
  tracing: Have trace events system open call tracing_open_generic_tr()
  tracing: Get trace_array reference for available_tracers files
  ftrace: Get a reference counter for the trace_array on filter files
  tracefs: Revert ccbd54ff54e8 ("tracefs: Restrict tracefs when the kernel is locked down")
2019-10-13 14:47:10 -07:00
Linus Torvalds
2581efa9a4 hwmon fixes for v5.4-rc3
Update/fix inspur-ipsps1 and k10temp Documentation
 Fix nct7904 driver
 Fix HWMON_P_MIN_ALARM mask in hwmon core
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiHPvMQj9QTOCiqgVyx8mb86fmYEFAl2iD0IACgkQyx8mb86f
 mYEnFQ//XJMPIukj659J3hwfzxEx3LXucQsQivUhmVXYzPAzdxa9eC97tj1I3pgz
 nKR2BIFvXtI+o3459TWuvj+Y7utmQXRQ5XiP6JBb+Wkn84IK6O0GNKSOPYDNI/3t
 /ii2XY3dERr8BuBPmRuYP7GorjFJH1P7B0FXrln5HLYAyS1yJVioLVhbrbi63NkL
 ngoxZhgcVP5QIbaSFC6ZJs1oqXXVIbvBEkt9bjkmDhJssmPtFweVxvIUnWUoPdAg
 A/GvmOddhRa/9vXzZbNThOFOLq7nHguIDvbIwizULMRHfDHB5aFEqyz9W/OFqSj5
 8s+GRiO1oYBxVrz4Bl32pR0aCJbZ5WlxhgKrLdlunRnATAzXpLzRAbKhOhoBchJZ
 xyC+96q0x8GIJ5dBqm2XqZhJssZgfgb/6SSlzSu+r6aQogHQc1/PDL26PHR517E5
 T/pUNY97LvOOgiosDuaLiGpiszoz0WdeHVqdUujnRiN5TtE5u0M6ACdumXGjFb8K
 0sPkep919yaaKNaPYA6u9PCHl4WG5EJdBkY44z4BkCt9scVvXEvATGUxNMwH8Bvm
 95w/swRlN2yq+fa4T4I3UtAAW9EgCJ3LIYYUo1me0SDld2cPwbVkMvOIirD3TBLC
 qoP178X34HaqeozOaW4Hi08QYLzktmCo5KFm6KKNFbFCQZ4oBo4=
 =d+np
 -----END PGP SIGNATURE-----

Merge tag 'hwmon-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging

Pull hwmon fixes from Guenter Roeck:

 - Update/fix inspur-ipsps1 and k10temp Documentation

 - Fix nct7904 driver

 - Fix HWMON_P_MIN_ALARM mask in hwmon core

* tag 'hwmon-for-v5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging:
  hwmon: docs: Extend inspur-ipsps1 title underline
  hwmon: (nct7904) Add array fan_alarm and vsen_alarm to store the alarms in nct7904_data struct.
  docs: hwmon: Include 'inspur-ipsps1.rst' into docs
  hwmon: Fix HWMON_P_MIN_ALARM mask
  hwmon: (k10temp) Update documentation and add temp2_input info
  hwmon: (nct7904) Fix the incorrect value of vsen_mask in nct7904_data struct
2019-10-13 08:40:31 -07:00
Linus Torvalds
71b1b5532b This pull request contains two fixes for MTD:
- spi-nor: Fix for a regression in write_sr()
 - rawnand: Regression fix for the au1550nd driver
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEdgfidid8lnn52cLTZvlZhesYu8EFAl2iNXoWHHJpY2hhcmRA
 c2lnbWEtc3Rhci5hdAAKCRBm+VmF6xi7wXBVD/sH9DCWSFAUXwqnXG1GP/05R++q
 0OHc8t8DmTxcsNXMmPUhj/icQHRf611QvH9T3oOKCn6umOeuDskB89H4ELDnk9kg
 ncxW48ijCRT4RNvL0FPbsbL3eU22kRtLoHj9VSa4tSjeNzgdbBG5sdugmASCNByi
 WLyNY6nXZG2v8lpRwP/uv6zDtlp3r7ucNVdKlKWTsdBzpL94LIkwYgsWg/TTG6eO
 9/29DBEjQonu3kCMkxKoHP1OVfS+XYKnbg4W6MNyDFRyT6xJxgt191K4q735ZiCJ
 Qf+rjUMo9/7WTUdSlAept3tb/0iKAV6V8Ql8xBGfLsGMsa7O6eh083d+9QvjXMw9
 I3dwsb9Of7t2n6Cj9pT74k6UA3KSfkM4v7r87Gr4UkyWxNn5uIn9uy/WJ+yCHFUd
 1aDC6ZR8dzzsJNoWx6E4zFtBtu+Whd9/V18qTGegjg/eFK51z4Dh6bDkyKveYV5C
 3343fuYQ0KO4wsFyiUcraP8BQcxk2FkjMcWS3bX3tbJuux8GV9UwfKSdlICwEgw+
 Rql4N4SKZWoLg7d1tXOQ9fGikBel0mwQOvgABJoVOq13doUTesdiVO2JyUIblkvP
 cA4w1NDgjXZ+kMh0n6aR+kJZcHd751/EoO0mk8tt6JNT80NWottr4CpzSNx+MbAW
 YgYzi/q35JnIV5Ec4Q==
 =mJlc
 -----END PGP SIGNATURE-----

Merge tag 'fixes-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux

Pull MTD fixes from Richard Weinberger:
 "Two fixes for MTD:

   - spi-nor: Fix for a regression in write_sr()

   - rawnand: Regression fix for the au1550nd driver"

* tag 'fixes-for-5.4-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/mtd/linux:
  mtd: rawnand: au1550nd: Fix au_read_buf16() prototype
  mtd: spi-nor: Fix direction of the write_sr() transfer
2019-10-13 08:26:54 -07:00
Linus Torvalds
b27528b027 for-linus-20191012
-----BEGIN PGP SIGNATURE-----
 
 iQJEBAABCAAuFiEEwPw5LcreJtl1+l5K99NY+ylx4KYFAl2iBvAQHGF4Ym9lQGtl
 cm5lbC5kawAKCRD301j7KXHgpvokEACaUozc0Y2L99CqJcRtoH8cwIAimjKTFSmi
 UKijO/w11uc9xCIOospkM8iseRbpP/oCIwfYyVROi4LniPB1TSyQJoLF/MELeqIi
 lOSg8RrXzC22h5+QG3BR5VO55me3MwIXmJLMMWXbh7FAuUg+wepVwHwzZBurd1Pz
 C1wG8NU758AmqJdSGclUbdLLwTBkR+fXoi2EHfdbsBMqGmdZ5VRzCLx70grDcxq2
 S9gt+2+diME6o6p1wko+m9pdhUdEy7epPE50K4SA+DU06I77OZpTdSAPilPyQHei
 +3HDslzjCyos5/Ay+7hWl42jvyZ+x6GI5VcDLMTcMZsmxLy/WObDB4zcAixJrDV0
 c7XWnwCCs2wMXYMsu0ASbkU32mcQ/Ik8CrX0W3/tP1aZBW2LZnOHvXYbybzpbgbo
 QY47t3jz21uzF5kplBGxqaY1sPmLxp5V0vQmj9eN+Y7T+hiR/UgWKO/5B4CGNJe0
 BQkXNl5cZAtqxmSE8AxlpOla7z6beZUljdhFc7K8cMDdhoatRgm2MBh2G9BB1ayo
 VwHc1qLcvInpQrGiC4l5H/F8jrEs/PglknFSs1wlsav1gXA69J0cHA7LY6Czweaw
 mkiKr7jA6duBBn6MRUThF5sBMFWYuxp3B/BA4wfDJsSfCmyhWN+OZlsDaEVrzk7o
 /XK2Ew9R2Q==
 =PUeL
 -----END PGP SIGNATURE-----

Merge tag 'for-linus-20191012' of git://git.kernel.dk/linux-block

Pull io_uring fix from Jens Axboe:
 "Single small fix for a regression in the sequence logic for linked
  commands"

* tag 'for-linus-20191012' of git://git.kernel.dk/linux-block:
  io_uring: fix sequence logic for timeout requests
2019-10-13 08:15:35 -07:00
Michael S. Tsirkin
245cdd9fbd vhost/test: stop device before reset
When device stop was moved out of reset, test device wasn't updated to
stop before reset, this resulted in a use after free.  Fix by invoking
stop appropriately.

Fixes: b211616d7125 ("vhost: move -net specific code out")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-13 09:38:27 -04:00
Michael S. Tsirkin
edc5774c09 tools/virtio: xen stub
Fixes test module build.

Reported-by: Jan Kiszka <jan.kiszka@web.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2019-10-13 09:38:27 -04:00
Petr Mladek
d303de1fcf tracing: Initialize iter->seq after zeroing in tracing_read_pipe()
A customer reported the following softlockup:

[899688.160002] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [test.sh:16464]
[899688.160002] CPU: 0 PID: 16464 Comm: test.sh Not tainted 4.12.14-6.23-azure #1 SLE12-SP4
[899688.160002] RIP: 0010:up_write+0x1a/0x30
[899688.160002] Kernel panic - not syncing: softlockup: hung tasks
[899688.160002] RIP: 0010:up_write+0x1a/0x30
[899688.160002] RSP: 0018:ffffa86784d4fde8 EFLAGS: 00000257 ORIG_RAX: ffffffffffffff12
[899688.160002] RAX: ffffffff970fea00 RBX: 0000000000000001 RCX: 0000000000000000
[899688.160002] RDX: ffffffff00000001 RSI: 0000000000000080 RDI: ffffffff970fea00
[899688.160002] RBP: ffffffffffffffff R08: ffffffffffffffff R09: 0000000000000000
[899688.160002] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8b59014720d8
[899688.160002] R13: ffff8b59014720c0 R14: ffff8b5901471090 R15: ffff8b5901470000
[899688.160002]  tracing_read_pipe+0x336/0x3c0
[899688.160002]  __vfs_read+0x26/0x140
[899688.160002]  vfs_read+0x87/0x130
[899688.160002]  SyS_read+0x42/0x90
[899688.160002]  do_syscall_64+0x74/0x160

It caught the process in the middle of trace_access_unlock(). There is
no loop. So, it must be looping in the caller tracing_read_pipe()
via the "waitagain" label.

Crashdump analyze uncovered that iter->seq was completely zeroed
at this point, including iter->seq.seq.size. It means that
print_trace_line() was never able to print anything and
there was no forward progress.

The culprit seems to be in the code:

	/* reset all but tr, trace, and overruns */
	memset(&iter->seq, 0,
	       sizeof(struct trace_iterator) -
	       offsetof(struct trace_iterator, seq));

It was added by the commit 53d0aa773053ab182877 ("ftrace:
add logic to record overruns"). It was v2.6.27-rc1.
It was the time when iter->seq looked like:

     struct trace_seq {
	unsigned char		buffer[PAGE_SIZE];
	unsigned int		len;
     };

There was no "size" variable and zeroing was perfectly fine.

The solution is to reinitialize the structure after or without
zeroing.

Link: http://lkml.kernel.org/r/20191011142134.11997-1-pmladek@suse.com

Signed-off-by: Petr Mladek <pmladek@suse.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12 20:49:34 -04:00
Srivatsa S. Bhat (VMware)
fc64e4ad80 tracing/hwlat: Don't ignore outer-loop duration when calculating max_latency
max_latency is intended to record the maximum ever observed hardware
latency, which may occur in either part of the loop (inner/outer). So
we need to also consider the outer-loop sample when updating
max_latency.

Link: http://lkml.kernel.org/r/157073345463.17189.18124025522664682811.stgit@srivatsa-ubuntu

Fixes: e7c15cd8a113 ("tracing: Added hardware latency tracer")
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12 20:49:33 -04:00
Srivatsa S. Bhat (VMware)
98dc19c114 tracing/hwlat: Report total time spent in all NMIs during the sample
nmi_total_ts is supposed to record the total time spent in *all* NMIs
that occur on the given CPU during the (active portion of the)
sampling window. However, the code seems to be overwriting this
variable for each NMI, thereby only recording the time spent in the
most recent NMI. Fix it by accumulating the duration instead.

Link: http://lkml.kernel.org/r/157073343544.17189.13911783866738671133.stgit@srivatsa-ubuntu

Fixes: 7b2c86250122 ("tracing: Add NMI tracing in hwlat detector")
Cc: stable@vger.kernel.org
Signed-off-by: Srivatsa S. Bhat (VMware) <srivatsa@csail.mit.edu>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12 20:49:33 -04:00
Steven Rostedt (VMware)
7f8557b88d recordmcount: Fix nop_mcount() function
The removal of the longjmp code in recordmcount.c mistakenly made the return
of make_nop() being negative an exit of nop_mcount(). It should not exit the
routine, but instead just not process that part of the code. By exiting with
an error code, it would cause the update of recordmcount to fail some files
which would fail the build if ftrace function tracing was enabled.

Link: http://lkml.kernel.org/r/20191009110538.5909fec6@gandalf.local.home

Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Tested-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Fixes: 3f1df12019f3 ("recordmcount: Rewrite error/success handling")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12 20:49:33 -04:00
Steven Rostedt (VMware)
bf8e602186 tracing: Do not create tracefs files if tracefs lockdown is in effect
If on boot up, lockdown is activated for tracefs, don't even bother creating
the files. This can also prevent instances from being created if lockdown is
in effect.

Link: http://lkml.kernel.org/r/CAHk-=whC6Ji=fWnjh2+eS4b15TnbsS4VPVtvBOwCy1jjEG_JHQ@mail.gmail.com

Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2019-10-12 20:49:07 -04:00