Pull x86 fixes from Ingo Molnar:
"This contains two fixes: a boot fix for older SGI/UV systems, and an
APIC calibration fix"
* 'x86-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/tsc: Read all ratio bits from MSR_PLATFORM_INFO
x86/platform/UV: Bring back the call to map_low_mmrs in uv_system_init
- Fix for a recent regression in the intel_pstate driver causing
it to fail to restore the HWP (HW-managed P-states) configuration
of the boot CPU after suspend-to-RAM (Rafael Wysocki).
- Fix for two recent regressions in the intel_pstate driver, one
that can trigger a divide by zero if the driver is accessed via
sysfs before it manages to take the first sample and one causing
it to fail to update a structure field used in a trace point, so
the information coming from it is less useful (Rafael Wysocki).
- Fix for a problem in the sti-cpufreq driver introduced during
the 4.5 cycle that causes it to break CPU PM in multi-platform
kernels by registering cpufreq-dt (which subsequently doesn't
work) unconditionally and preventing the driver that would
actually work from registering (Sudeep Holla).
- Stable-candidate fix for an ARM64 cpuidle issue causing idle
state usage counters to be incorrectly updated for idle states
that were not entered due to errors (James Morse).
- Fix for a recently introduced issue in the OPP (Operating
Performance Points) framework causing it to print bogus error
messages for missing optional regulators (Viresh Kumar).
- Fix for a recently introduced issue in the generic device
properties framework that may cause it to attempt to dereferece
and invalid pointer in some cases (Heikki Krogerus).
- Fix for a deadlock in the ACPICA core that may be triggered
by device (eg. Thunderbolt) hotplug (Prarit Bhargava).
/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.22 (GNU/Linux)
iQIcBAABCAAGBQJXLIwPAAoJEILEb/54YlRxT+wP/ROEo/r5IaRZ2k8cphWjsiKk
k9eDuWBL2KZ29ikghXs/vVY2fMbtQkaDT5h57imsUEKoEzI3MlYA3OkQyffFOcsY
dz/9EnG6K9Efi6VS1dS1tNCgl45aIeHLCqlVPOBCZ9TwSoAERdNJGqItJdS2YKIA
+C1LGrWl4UiJ95AOof9PHfKfnWxrnRbpIsB2PbxD0Swe5vfskrHoRWGOAMLJIwpF
7NvEJ15fryDIvlMR/ggNrg2L2piOu1fJl2kVZYWZTb/u+qAO3utxTQN4y++zTSNb
LAN78Hq/nJu156SSioO9fLa0wPaU+k2OChfWXtlMsTDK+L5EQz4G3pJwi5FA8QTD
nfeZNC9VgqfP4LtqWw05h/AOw4A0XUeuwB8Edbc+WG5twzULqDhS57jew4A4xX8d
jOsvK5syygnR+/rExWc0NWSmCH0g1u6mCUWXQuocfSb/oOEcUGq5RSixRNRfmJUq
9XNF3hbp7W/Vnp9GWT30Md+CenrEtQXFK8ZQtg0ckBl+b5bEqKYs6FXGqCkUmjZy
Qgt5sqxgdLWtslS3vSu1/mdryeaLmXNO6c6wueSPMmLyYODEoIHSSka9N9O0Inwv
d106p7gUy3/ETamC3lbnyHkUrAru74Qh8rErKpqaRLkKfcIq7YCB073fxbqlamzz
X4n8a1H37LefLqmKwIbF
=pU+A
-----END PGP SIGNATURE-----
Merge tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management and ACPI fixes from Rafael Wysocki:
"Fixes for problems introduced or discovered recently (intel_pstate,
sti-cpufreq, ARM64 cpuidle, Operating Performance Points framework,
generic device properties framework) and one fix for a hotplug-related
deadlock in ACPICA that's been there forever, but is nasty enough.
Specifics:
- Fix for a recent regression in the intel_pstate driver causing it
to fail to restore the HWP (HW-managed P-states) configuration of
the boot CPU after suspend-to-RAM (Rafael Wysocki).
- Fix for two recent regressions in the intel_pstate driver, one that
can trigger a divide by zero if the driver is accessed via sysfs
before it manages to take the first sample and one causing it to
fail to update a structure field used in a trace point, so the
information coming from it is less useful (Rafael Wysocki).
- Fix for a problem in the sti-cpufreq driver introduced during the
4.5 cycle that causes it to break CPU PM in multi-platform kernels
by registering cpufreq-dt (which subsequently doesn't work)
unconditionally and preventing the driver that would actually work
from registering (Sudeep Holla).
- Stable-candidate fix for an ARM64 cpuidle issue causing idle state
usage counters to be incorrectly updated for idle states that were
not entered due to errors (James Morse).
- Fix for a recently introduced issue in the OPP (Operating
Performance Points) framework causing it to print bogus error
messages for missing optional regulators (Viresh Kumar).
- Fix for a recently introduced issue in the generic device
properties framework that may cause it to attempt to dereferece and
invalid pointer in some cases (Heikki Krogerus).
- Fix for a deadlock in the ACPICA core that may be triggered by
device (eg Thunderbolt) hotplug (Prarit Bhargava)"
* tag 'pm+acpi-4.6-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm:
PM / OPP: Remove useless check
ACPICA: Dispatcher: Update thread ID for recursive method calls
intel_pstate: Fix intel_pstate_get()
cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
cpufreq: st: enable selective initialization based on the platform
ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
device property: Avoid potential dereferences of invalid pointers
Pull scheduler fix from Ingo Molnar:
"This contains a single fix that fixes a nohz tick stopping bug when
mixed-poliocy SCHED_FIFO and SCHED_RR tasks are present on a runqueue"
* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
nohz/full, sched/rt: Fix missed tick-reenabling bug in sched_can_stop_tick()
Pull perf fixes from Ingo Molnar:
"This tree contains two fixes: new Intel CPU model numbers and an
AMD/iommu uncore PMU driver fix"
* 'perf-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
perf/x86/amd/iommu: Do not register a task ctx for uncore like PMUs
perf/x86: Add model numbers for Kabylake CPUs
Pull EFI fixes from Ingo Molnar:
"This tree contains three fixes: a console spam fix, a file pattern fix
and a sysfb_efi fix for a bug that triggered on older ThinkPads"
* 'efi-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/sysfb_efi: Fix valid BAR address range check
x86/efi-bgrt: Switch all pr_err() to pr_notice() for invalid BGRT
MAINTAINERS: Remove asterisk from EFI directory names
Pull parisc fix from Helge Deller:
"Patch from Dmitry V Levin to fix a kernel crash when a straced process
calls the (invalid) syscall which is equal to value of __NR_Linux_syscalls"
* 'parisc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/deller/parisc-linux:
parisc: fix a bug when syscall number of tracee is __NR_Linux_syscalls
- Fix for PTE truncation in PAE40 builds
- Fix for big endian IO accessors lacking IO barrier
- Allow HIGHMEM to work with low physical addresses
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXLIohAAoJEGnX8d3iisJeXLcQALwte9aeySA+r1X7S/OkW6uw
kU+kAYqjxnjgIy4Zy+BmQwFMTxJpvb7NV9ZHvLWNFXWlyZEBAbDQz8LgLpuAekx4
nbyk5KvvSKKkWP19T0/pCr19mQC06qKWzEr91zZ/lqxDwZKR6GIlbnlT7ugB25gH
9dKSUww17YsS2nVvlq0d0nbs+rFP/4fE9O8RW096bo6zOi5j6dzP7jJy6AdG1Czq
+UQUcFdkIkJLwoOaamur49azNmpmdR8goOgz6kzj8NK2J/cQx5698YTArR9c2xSw
86TAr1yw4gnz9M+vinNrfY+mxs+n3YZTG//tVbuhL5nvGKDM6T1lkuyJPbiJHHDQ
NxULX60NYOP32Xl5/fnCGoIDL30rLwmCgn1T2uZONVnx9e0Ai51TJE3T7Qlux/3X
t/BlqzWQRIWOfR6jAtS+Gi3KM4fRwRNAeU4ed/kUqyXQSoRUNXOPof35+h7O7dSO
dvrUnhBtBXsHCaDRRLfxjDzNqV9T2aIr/zZLM3rkBw79SPVIBHdV1hjLt0IxpMrn
ttwhjVQXahD+MUGxLqS7efwdyHzjtOB/D1wW/6Fg6n/ircvTcIziV4pThHCFTxs9
zWgmGgtD2XWSj0fhNXqe+AyjW1a6ZiH3GjMfV9cBbAsydFmUDCws8kWMTQOUSRNw
1l/L0DIsBZzGlnredJzJ
=bf8d
-----END PGP SIGNATURE-----
Merge tag 'arc-4.6-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
Pull ARC fixes from Vineet Gupta:
"Late in the cycle, but this has fixes for couple of issues: a PAE40
boot crash and Arnd spotting lack of barriers in BE io-accessors.
The 3rd patch for enabling highmem in low physical mem ;-) honestly is
more than a "fix" but its been in works for some time, seems to be
stable in testing and enables 2 of our customers to go forward with
4.6 kernel.
- Fix for PTE truncation in PAE40 builds
- Fix for big endian IO accessors lacking IO barrier
- Allow HIGHMEM to work with low physical addresses"
* tag 'arc-4.6-rc7-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc:
ARC: support HIGHMEM even without PAE40
ARC: Fix PAE40 boot failures due to PTE truncation
ARC: Add missing io barriers to io{read,write}{16,32}be()
- Fix bad inline asm constraint in create_zero_mask() from Anton Blanchard
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXLEx6AAoJEFHr6jzI4aWAQskQAL9rtQtxhJQnCE97832iIgWn
+YEef53mSTX7CEpoQqGY/2PUAOeQ8MPteewNaO5spZC2Vv34bXuHtv0q1uhiVJ6K
Q0t1LSbm60rv9BIuO8rDSJmSGnSljI1bczd0WVsTHuYLyFrVPtRZaDBJT9Bh2JTQ
+K/GE3o4R1X5pnUlZ4UqCcN7KwLrqHkQrmEcsSdUQMJ4s4jZaxS1KP9Kzq/M+mjy
L9K4SyXneMyCgiUCIdu3hN5A7Vb+2NauGmCP/eV6x4+M/Bq5JC4vNDUcCKE0IBLF
bNP5nvMKBBZOHlwxVWo9d2UWPeyEoIz4sKiZ26AE+urCYGhRMHb4M/Mh0qUbru13
CwZ1BwmZnyYs4pY8INbZJ+TvLcFgROfxxeUlT0KHStlnOrIbmgeKPWI/nymsL6n8
uHgSc8SnhskNW0n4NhWKVPD5iKqeWV2spFr5aTQ2tLDi7s2rT2oZjl5/GvKSIt+o
NlmkL2LpbIOftRQ2j+EhHPq/acV8+d20yjN8Zul0tSTNuFeSk6rXyXhEiYXfU+Ao
lPjkheNJZ9Oq0MAW0dvqWK4DrK31lhnZYcauO84cSA39sGa9BhHFe1E9XXMDSDwf
YFs7ihMcbap1pejcf7waJJ5axCwN1SgxUGH1lCuGmOAypftbQT5TpB7m52ctAS5J
vbsUZ8Z0W6OwsfSiKc/G
=IkNh
-----END PGP SIGNATURE-----
Merge tag 'powerpc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fix from Michael Ellerman:
"Fix bad inline asm constraint in create_zero_mask() from Anton
Blanchard"
* tag 'powerpc-4.6-5' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc: Fix bad inline asm constraint in create_zero_mask()
Pull drm fixes from Dave Airlie:
"Fixes for i915, amdgpu/radeon and imx.
The IMX fix is for an autoloading regression found in Fedora. The
radeon fixes, are the same fix to amdgpu/radeon to avoid a hardware
lockup in some circumstances with a bad mode, and a double free bug I
took a few hours chasing down the other morning.
The i915 fixes are across the board, all stable material, and fixing
some hangs and suspend/resume issues, along with a live status
regressions"
* 'drm-fixes' of git://people.freedesktop.org/~airlied/linux:
gpu: ipu-v3: Fix imx-ipuv3-crtc module autoloading
drm/amdgpu: make sure vertical front porch is at least 1
drm/radeon: make sure vertical front porch is at least 1
drm/amdgpu: set metadata pointer to NULL after freeing.
drm/i915: Make RPS EI/thresholds multiple of 25 on SNB-BDW
drm/i915: Fake HDMI live status
drm/i915: Fix eDP low vswing for Broadwell
drm/i915/ddi: Fix eDP VDD handling during booting and suspend/resume
drm/i915: Fix system resume if PCI device remained enabled
drm/i915: Avoid stalling on pending flips for legacy cursor updates
During a mount, we start the cleaner kthread first because the transaction
kthread wants to wake up the cleaner kthread. We start the transaction
kthread next because everything in btrfs wants transactions. We do reloc
recovery in the thread that was doing the original mount call once the
transaction kthread is running. This means that the cleaner kthread
could already be running when reloc recovery happens (e.g. if a snapshot
delete was started before a crash).
Relocation does not play well with the cleaner kthread, so a mutex was
added in commit 5f3164813b90f7dbcb5c3ab9006906222ce471b7 "Btrfs: fix
race between balance recovery and root deletion" to prevent both from
being active at the same time.
If the cleaner kthread is already holding the mutex by the time we get
to btrfs_recover_relocation, the mount will be blocked until at least
one deleted subvolume is cleaned (possibly more if the mount process
doesn't get the lock right away). During this time (which could be an
arbitrarily long time on a large/slow filesystem), the mount process is
stuck and the filesystem is unnecessarily inaccessible.
Fix this by locking cleaner_mutex before we start cleaner_kthread, and
unlocking the mutex after mount no longer requires it. This ensures
that the mounting process will not be blocked by the cleaner kthread.
The cleaner kthread is already prepared for mutex contention and will
just go to sleep until the mutex is available.
Signed-off-by: Zygo Blaxell <ce3g8jdj@umail.furryterror.org>
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
pagev array in scrub_block{} is of size SCRUB_MAX_PAGES_PER_BLOCK.
page_index should be checked with the same to trigger BUG_ON().
Signed-off-by: Ashish Samant <ashish.samant@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
btrfs_map_block can go horribly wrong in the face of fs corruption, lets agree
to not be assholes and panic at any possible chance things are all fucked up.
Signed-off-by: Josef Bacik <jbacik@fb.com>
[ removed type casts ]
Signed-off-by: David Sterba <dsterba@suse.com>
The struct 'map_lookup' uses type int for @stripe_len, while
btrfs_chunk_stripe_len() can return a u64 value, and it may end up with
@stripe_len being undefined value and it can lead to 'divide error' in
__btrfs_map_block().
This changes 'map_lookup' to use type u64 for stripe_len, also right now
we only use BTRFS_STRIPE_LEN for stripe_len, so this adds a valid checker for
BTRFS_STRIPE_LEN.
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Reported-by: Quentin Casasnovas <quentin.casasnovas@oracle.com>
Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
[ folded division fix to scrub_raid56_parity ]
Signed-off-by: David Sterba <dsterba@suse.com>
If the label setting ioctl races with sysfs label handler, we could get
mixed result in the output, part old part new. We should either get the
old or new label. The chances to hit this race are low.
Signed-off-by: David Sterba <dsterba@suse.com>
Add a sanity check for the fs_info as we will dereference it, similar to
what the 'store features' handler does.
Signed-off-by: David Sterba <dsterba@suse.com>
The key variable occupies 17 bytes, the key_start is used once, we can
simply reuse existing 'key' for that purpose. As the key is not a simple
type, compiler doest not do it on itself.
Signed-off-by: David Sterba <dsterba@suse.com>
The size of root item is more than 400 bytes, which is quite a lot of
stack space. As we do IO from inside the subvolume ioctls, we should
keep the stack usage low in case the filesystem is on top of other
layers (NFS, device mapper, iscsi, etc).
Reviewed-by: Tsutomu Itoh <t-itoh@jp.fujitsu.com>
Signed-off-by: David Sterba <dsterba@suse.com>
The "sizeof(*arg->clone_sources) * arg->clone_sources_count" expression
can overflow. It causes several static checker warnings. It's all
under CAP_SYS_ADMIN so it's not that serious but lets silence the
warnings.
Signed-off-by: Dan Carpenter <dan.carpenter@oracle.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Since mixed block groups accounting isn't byte-accurate and f_bree is an
unsigned integer, it could overflow. Avoid this.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Suggested-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Metadata for mixed block is already accounted in total data and should not
be counted as part of the free metadata space.
Signed-off-by: Luis de Bethencourt <luisbg@osg.samsung.com>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=114281
Signed-off-by: David Sterba <dsterba@suse.com>
Currently, we don't allow the user to try and rebalance to a dup profile
on a multi-device filesystem. In most cases, this is a perfectly sensible
restriction as raid1 uses the same amount of space and provides better
protection.
However, when reshaping a multi-device filesystem down to a single device
filesystem, this requires the user to convert metadata and system chunks
to single profile before deleting devices, and then convert again to dup,
which leaves a period of time where metadata integrity is reduced.
This patch removes the single-device-only restriction from converting to
dup profile to remove this potential data integrity reduction.
Signed-off-by: Austin S. Hemmelgarn <ahferroin7@gmail.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Do not load one entry beyond the end of the syscall table when the
syscall number of a traced process equals to __NR_Linux_syscalls.
Similar bug with regular processes was fixed by commit 3bb457af4fa8
("[PARISC] Fix bug when syscall nr is __NR_Linux_syscalls").
This bug was found by strace test suite.
Cc: stable@vger.kernel.org
Signed-off-by: Dmitry V. Levin <ldv@altlinux.org>
Acked-by: Helge Deller <deller@gmx.de>
Signed-off-by: Helge Deller <deller@gmx.de>
It seems to be long time unused, since 2008 and
6885f308b5570 ("Btrfs: Misc 2.6.25 updates").
Propagating the removal touches some code but has no functional effect.
Signed-off-by: David Sterba <dsterba@suse.com>
* pm-opp-fixes:
PM / OPP: Remove useless check
* pm-cpufreq-fixes:
intel_pstate: Fix intel_pstate_get()
cpufreq: intel_pstate: Fix HWP on boot CPU after system resume
cpufreq: st: enable selective initialization based on the platform
* pm-cpuidle-fixes:
ARM: cpuidle: Pass on arm_cpuidle_suspend()'s return value
Currently we read the tsc radio: ratio = (MSR_PLATFORM_INFO >> 8) & 0x1f;
Thus we get bit 8-12 of MSR_PLATFORM_INFO, however according to the SDM
(35.5), the ratio bits are bit 8-15.
Ignoring the upper bits can result in an incorrect tsc ratio, which causes the
TSC calibration and the Local APIC timer frequency to be incorrect.
Fix this problem by masking 0xff instead.
[ tglx: Massaged changelog ]
Fixes: 7da7c1561366 "x86, tsc: Add static (MSR) TSC calibration on Intel Atom SoCs"
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: stable@vger.kernel.org
Cc: Bin Gao <bin.gao@intel.com>
Cc: Len Brown <lenb@kernel.org>
Link: http://lkml.kernel.org/r/1462505619-5516-1-git-send-email-yu.c.chen@intel.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Merge fixes from Andrew Morton:
"14 fixes"
* emailed patches from Andrew Morton <akpm@linux-foundation.org>:
byteswap: try to avoid __builtin_constant_p gcc bug
lib/stackdepot: avoid to return 0 handle
mm: fix kcompactd hang during memory offlining
modpost: fix module autoloading for OF devices with generic compatible property
proc: prevent accessing /proc/<PID>/environ until it's ready
mm/zswap: provide unique zpool name
mm: thp: kvm: fix memory corruption in KVM with THP enabled
MAINTAINERS: fix Rajendra Nayak's address
mm, cma: prevent nr_isolated_* counters from going negative
mm: update min_free_kbytes from khugepaged after core initialization
huge pagecache: mmap_sem is unlocked when truncation splits pmd
rapidio/mport_cdev: fix uapi type definitions
mm: memcontrol: let v2 cgroups follow changes in system swappiness
mm: thp: correct split_huge_pages file permission
Pull libnvdimm fixes from Dan Williams:
- a fix for the persistent memory 'struct page' driver. The
implementation overlooked the fact that pages are allocated in 2MB
units leading to -ENOMEM when establishing some configurations.
It's tagged for -stable as the problem was introduced with the
initial implementation in 4.5.
- The new "error status translation" routine, introduced with the 4.6
updates to the nfit driver, missed a necessary path in
acpi_nfit_ctl().
The end result is that we are falsely assuming commands complete
successfully when the embedded status says otherwise.
* 'libnvdimm-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/nvdimm/nvdimm:
nfit: fix translation of command status results
libnvdimm, pfn: fix memmap reservation sizing
This is another attempt to avoid a regression in wwn_to_u64() after that
started using get_unaligned_be64(), which in turn ran into a bug on
gcc-4.9 through 6.1.
The regression got introduced due to the combination of two separate
workarounds (commits e3bde9568d99: "include/linux/unaligned: force
inlining of byteswap operations" and ef3fb2422ffe: "scsi: fc: use
get/put_unaligned64 for wwn access") that each try to sidestep distinct
problems with gcc behavior (code growth and increased stack usage).
Unfortunately after both have been applied, a more serious gcc bug has
been uncovered, leading to incorrect object code that discards part of a
function and causes undefined behavior.
As part of this problem is how __builtin_constant_p gets evaluated on an
argument passed by reference into an inline function, this avoids the
use of __builtin_constant_p() for all architectures that set
CONFIG_ARCH_USE_BUILTIN_BSWAP. Most architectures do not set
ARCH_SUPPORTS_OPTIMIZED_INLINING, which means they probably do not
suffer from the problem in the qla2xxx driver, but they might still run
into it elsewhere.
Both of the original workarounds were only merged in the 4.6 kernel, and
the bug that is fixed by this patch should only appear if both are
there, so we probably don't need to backport the fix. On the other
hand, it works by simplifying the code path and should not have any
negative effects.
[arnd@arndb.de: fix older gcc warnings]
(http://lkml.kernel.org/r/12243652.bxSxEgjgfk@wuerfel)
Link: https://lkml.org/lkml/headers/2016/4/12/1103
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=66122
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70232
Link: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
Fixes: e3bde9568d99 ("include/linux/unaligned: force inlining of byteswap operations")
Fixes: ef3fb2422ffe ("scsi: fc: use get/put_unaligned64 for wwn access")
Link: http://lkml.kernel.org/r/1780465.XdtPJpi8Tt@wuerfel
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Josh Poimboeuf <jpoimboe@redhat.com> # on gcc-5.3
Tested-by: Quinn Tran <quinn.tran@qlogic.com>
Cc: Martin Jambor <mjambor@suse.cz>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>
Cc: James Bottomley <James.Bottomley@hansenpartnership.com>
Cc: Denys Vlasenko <dvlasenk@redhat.com>
Cc: Thomas Graf <tgraf@suug.ch>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: David Rientjes <rientjes@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Himanshu Madhani <himanshu.madhani@qlogic.com>
Cc: Jan Hubicka <hubicka@ucw.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Recently, we allow to save the stacktrace whose hashed value is 0. It
causes the problem that stackdepot could return 0 even if in success.
User of stackdepot cannot distinguish whether it is success or not so we
need to solve this problem. In this patch, 1 bit are added to handle
and make valid handle none 0 by setting this bit. After that, valid
handle will not be 0 and 0 handle will represent failure correctly.
Fixes: 33334e25769c ("lib/stackdepot.c: allow the stack trace hash to be zero")
Link: http://lkml.kernel.org/r/1462252403-1106-1-git-send-email-iamjoonsoo.kim@lge.com
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Alexander Potapenko <glider@google.com>
Cc: Andrey Ryabinin <aryabinin@virtuozzo.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Assume memory47 is the last online block left in node1. This will hang:
# echo offline > /sys/devices/system/node/node1/memory47/state
After a couple of minutes, the following pops up in dmesg:
INFO: task bash:957 blocked for more than 120 seconds.
Not tainted 4.6.0-rc6+ #6
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
bash D ffff8800b7adbaf8 0 957 951 0x00000000
Call Trace:
schedule+0x35/0x80
schedule_timeout+0x1ac/0x270
wait_for_completion+0xe1/0x120
kthread_stop+0x4f/0x110
kcompactd_stop+0x26/0x40
__offline_pages.constprop.28+0x7e6/0x840
offline_pages+0x11/0x20
memory_block_action+0x73/0x1d0
memory_subsys_offline+0x47/0x60
device_offline+0x86/0xb0
store_mem_state+0xda/0xf0
dev_attr_store+0x18/0x30
sysfs_kf_write+0x37/0x40
kernfs_fop_write+0x11d/0x170
__vfs_write+0x37/0x120
vfs_write+0xa9/0x1a0
SyS_write+0x55/0xc0
entry_SYSCALL_64_fastpath+0x1a/0xa4
kcompactd is waiting for kcompactd_max_order > 0 when it's woken up to
actually exit. Check kthread_should_stop() to break out of the wait.
Fixes: 698b1b306 ("mm, compaction: introduce kcompactd").
Reported-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Tested-by: Reza Arbab <arbab@linux.vnet.ibm.com>
Cc: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: Rik van Riel <riel@redhat.com>
Cc: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: David Rientjes <rientjes@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Hugh Dickins <hughd@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Since the wildcard at the end of OF module aliases is gone, autoloading
of modules that don't match a device's last (most generic) compatible
value fails.
For example the CODA960 VPU on i.MX6Q has the SoC specific compatible
"fsl,imx6q-vpu" and the generic compatible "cnm,coda960". Since the
driver currently only works with knowledge about the SoC specific
integration, it doesn't list "cnm,cod960" in the module device table.
This results in the device compatible
"of:NvpuT<NULL>Cfsl,imx6q-vpuCcnm,coda960" not matching the module alias
"of:N*T*Cfsl,imx6q-vpu" anymore, whereas before commit 2f632369ab79
("modpost: don't add a trailing wildcard for OF module aliases") it
matched the module alias "of:N*T*Cfsl,imx6q-vpu*".
This patch adds two module aliases for each compatible, one without the
wildcard and one with "C*" appended.
$ modinfo coda | grep imx6q
alias: of:N*T*Cfsl,imx6q-vpuC*
alias: of:N*T*Cfsl,imx6q-vpu
Fixes: 2f632369ab79 ("modpost: don't add a trailing wildcard for OF module aliases")
Link: http://lkml.kernel.org/r/1462203339-15340-1-git-send-email-p.zabel@pengutronix.de
Signed-off-by: Philipp Zabel <p.zabel@pengutronix.de>
Cc: Javier Martinez Canillas <javier@osg.samsung.com>
Cc: Brian Norris <computersforpeace@gmail.com>
Cc: Sjoerd Simons <sjoerd.simons@collabora.co.uk>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: <stable@vger.kernel.org> [4.5+]
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
If /proc/<PID>/environ gets read before the envp[] array is fully set up
in create_{aout,elf,elf_fdpic,flat}_tables(), we might end up trying to
read more bytes than are actually written, as env_start will already be
set but env_end will still be zero, making the range calculation
underflow, allowing to read beyond the end of what has been written.
Fix this as it is done for /proc/<PID>/cmdline by testing env_end for
zero. It is, apparently, intentionally set last in create_*_tables().
This bug was found by the PaX size_overflow plugin that detected the
arithmetic underflow of 'this_len = env_end - (env_start + src)' when
env_end is still zero.
The expected consequence is that userland trying to access
/proc/<PID>/environ of a not yet fully set up process may get
inconsistent data as we're in the middle of copying in the environment
variables.
Fixes: https://forums.grsecurity.net/viewtopic.php?f=3&t=4363
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=116461
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Emese Revfy <re.emese@gmail.com>
Cc: Pax Team <pageexec@freemail.hu>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Mateusz Guzik <mguzik@redhat.com>
Cc: Alexey Dobriyan <adobriyan@gmail.com>
Cc: Cyrill Gorcunov <gorcunov@openvz.org>
Cc: Jarod Wilson <jarod@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Instead of using "zswap" as the name for all zpools created, add an
atomic counter and use "zswap%x" with the counter number for each zpool
created, to provide a unique name for each new zpool.
As zsmalloc, one of the zpool implementations, requires/expects a unique
name for each pool created, zswap should provide a unique name. The
zsmalloc pool creation does not fail if a new pool with a conflicting
name is created, unless CONFIG_ZSMALLOC_STAT is enabled; in that case,
zsmalloc pool creation fails with -ENOMEM. Then zswap will be unable to
change its compressor parameter if its zpool is zsmalloc; it also will
be unable to change its zpool parameter back to zsmalloc, if it has any
existing old zpool using zsmalloc with page(s) in it. Attempts to
change the parameters will result in failure to create the zpool. This
changes zswap to provide a unique name for each zpool creation.
Fixes: f1c54846ee45 ("zswap: dynamic pool creation")
Signed-off-by: Dan Streetman <ddstreet@ieee.org>
Reported-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Reviewed-by: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Dan Streetman <dan.streetman@canonical.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
After the THP refcounting change, obtaining a compound pages from
get_user_pages() no longer allows us to assume the entire compound page
is immediately mappable from a secondary MMU.
A secondary MMU doesn't want to call get_user_pages() more than once for
each compound page, in order to know if it can map the whole compound
page. So a secondary MMU needs to know from a single get_user_pages()
invocation when it can map immediately the entire compound page to avoid
a flood of unnecessary secondary MMU faults and spurious
atomic_inc()/atomic_dec() (pages don't have to be pinned by MMU notifier
users).
Ideally instead of the page->_mapcount < 1 check, get_user_pages()
should return the granularity of the "page" mapping in the "mm" passed
to get_user_pages(). However it's non trivial change to pass the "pmd"
status belonging to the "mm" walked by get_user_pages up the stack (up
to the caller of get_user_pages). So the fix just checks if there is
not a single pte mapping on the page returned by get_user_pages, and in
turn if the caller can assume that the whole compound page is mapped in
the current "mm" (in a pmd_trans_huge()). In such case the entire
compound page is safe to map into the secondary MMU without additional
get_user_pages() calls on the surrounding tail/head pages. In addition
of being faster, not having to run other get_user_pages() calls also
reduces the memory footprint of the secondary MMU fault in case the pmd
split happened as result of memory pressure.
Without this fix after a MADV_DONTNEED (like invoked by QEMU during
postcopy live migration or balloning) or after generic swapping (with a
failure in split_huge_page() that would only result in pmd splitting and
not a physical page split), KVM would map the whole compound page into
the shadow pagetables, despite regular faults or userfaults (like
UFFDIO_COPY) may map regular pages into the primary MMU as result of the
pte faults, leading to the guest mode and userland mode going out of
sync and not working on the same memory at all times.
Any other secondary MMU notifier manager (KVM is just one of the many
MMU notifier users) will need the same information if it doesn't want to
run a flood of get_user_pages_fast and it can support multiple
granularity in the secondary MMU mappings, so I think it is justified to
be exposed not just to KVM.
The other option would be to move transparent_hugepage_adjust to
mm/huge_memory.c but that currently has all kind of KVM data structures
in it, so it's definitely not a cut-and-paste work, so I couldn't do a
fix as cleaner as this one for 4.6.
Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>
Cc: "Li, Liang Z" <liang.z.li@intel.com>
Cc: Amit Shah <amit.shah@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
/proc/sys/vm/stat_refresh warns nr_isolated_anon and nr_isolated_file go
increasingly negative under compaction: which would add delay when
should be none, or no delay when should delay. The bug in compaction
was due to a recent mmotm patch, but much older instance of the bug was
also noticed in isolate_migratepages_range() which is used for CMA and
gigantic hugepage allocations.
The bug is caused by putback_movable_pages() in an error path
decrementing the isolated counters without them being previously
incremented by acct_isolated(). Fix isolate_migratepages_range() by
removing the error-path putback, thus reaching acct_isolated() with
migratepages still isolated, and leaving putback to caller like most
other places do.
Fixes: edc2ca612496 ("mm, compaction: move pageblock checks up from isolate_migratepages_range()")
[vbabka@suse.cz: expanded the changelog]
Signed-off-by: Hugh Dickins <hughd@google.com>
Signed-off-by: Vlastimil Babka <vbabka@suse.cz>
Acked-by: Joonsoo Kim <iamjoonsoo.kim@lge.com>
Cc: Michal Hocko <mhocko@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>