commit 0b0c002c340e78173789f8afaa508070d838cf3d upstream.
... because the "clock_event_device framework" already accounts for idle
time through the "event_handler" function pointer in
xen_timer_interrupt().
The patch is intended as the completion of [1]. It should fix the double
idle times seen in PV guests' /proc/stat [2]. It should be orthogonal to
stolen time accounting (the removed code seems to be isolated).
The approach may be completely misguided.
[1] https://lkml.org/lkml/2011/10/6/10
[2] http://lists.xensource.com/archives/html/xen-devel/2010-08/msg01068.html
John took the time to retest this patch on top of v3.10 and reported:
"idle time is correctly incremented for pv and hvm for the normal
case, nohz=off and nohz=idle." so lets put this patch in.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: John Haxby <john.haxby@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03617c188f41eeeb4223c919ee7e66e5a114f2c6 upstream.
Some userspaces do not preserve unusable property. Since usable
segment has to be present according to VMX spec we can use present
property to amend userspace bug by making unusable segment always
nonpresent. vmx_segment_access_rights() already marks nonpresent segment
as unusable.
Reported-by: Stefan Pietsch <stefan.pietsch@lsexperts.de>
Tested-by: Stefan Pietsch <stefan.pietsch@lsexperts.de>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea461abf61753b4b79e625a7c20650105b990f21 upstream.
While running Linux as guest on top of phyp, we possiblly have
PE that includes single PCI device. However, we didn't return
its PCI bus correctly and it leads to failure on recovery from
EEH errors for single-dev-PE. The patch fixes the issue.
Signed-off-by: Gavin Shan <shangw@linux.vnet.ibm.com>
Cc: Steve Best <sbest@us.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 03bbcb2e7e292838bb0244f5a7816d194c911d62 upstream.
A few years back intel published a spec update:
http://www.intel.com/content/dam/doc/specification-update/5520-and-5500-chipset-ioh-specification-update.pdf
For the 5520 and 5500 chipsets which contained an errata (specificially errata
53), which noted that these chipsets can't properly do interrupt remapping, and
as a result the recommend that interrupt remapping be disabled in bios. While
many vendors have a bios update to do exactly that, not all do, and of course
not all users update their bios to a level that corrects the problem. As a
result, occasionally interrupts can arrive at a cpu even after affinity for that
interrupt has be moved, leading to lost or spurrious interrupts (usually
characterized by the message:
kernel: do_IRQ: 7.71 No irq handler for vector (irq -1)
There have been several incidents recently of people seeing this error, and
investigation has shown that they have system for which their BIOS level is such
that this feature was not properly turned off. As such, it would be good to
give them a reminder that their systems are vulnurable to this problem. For
details of those that reported the problem, please see:
https://bugzilla.redhat.com/show_bug.cgi?id=887006
[ Joerg: Removed CONFIG_IRQ_REMAP ifdef from early-quirks.c ]
Signed-off-by: Neil Horman <nhorman@tuxdriver.com>
CC: Prarit Bhargava <prarit@redhat.com>
CC: Don Zickus <dzickus@redhat.com>
CC: Don Dutile <ddutile@redhat.com>
CC: Bjorn Helgaas <bhelgaas@google.com>
CC: Asit Mallick <asit.k.mallick@intel.com>
CC: David Woodhouse <dwmw2@infradead.org>
CC: linux-pci@vger.kernel.org
CC: Joerg Roedel <joro@8bytes.org>
CC: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
CC: Arkadiusz Miśkiewicz <arekm@maven.pl>
Signed-off-by: Joerg Roedel <joro@8bytes.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 690cec8e70c211d1f5f6e520b21a68d0306173b6 upstream.
In uniprocessor configurations, synchronize_irq() is defined in
<linux/hardirq.h> as a macro, and this function definition fails to
compile.
Reported-by: kbuild test robot <fengguang.wu@intel.com>
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c46b54f7406780ec4cf9c9124d1cfb777674dc70 upstream.
All architectures must implement IRQ functions. Since various
dependencies on !S390 were removed, there are various drivers that can
be selected but will fail to link. Provide a dummy implementation of
these functions for the !PCI case.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Acked-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 63384fd0b1509acf522a8a8fcede09087eedb7df upstream.
Commit 1bc3974 (ARM: 7755/1: handle user space mapped pages in
flush_kernel_dcache_page) moved the implementation of
flush_kernel_dcache_page() into mm/flush.c but did not implement it
on noMMU ARM.
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Acked-by: Kevin Hilman <khilman@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 1bc39742aab09248169ef9d3727c9def3528b3f3 upstream.
Commit f8b63c1 made flush_kernel_dcache_page a no-op assuming that
the pages it needs to handle are kernel mapped only. However, for
example when doing direct I/O, pages with user space mappings may
occur.
Thus, continue to do lazy flushing if there are no user space
mappings. Otherwise, flush the kernel cache lines directly.
Signed-off-by: Simon Baatz <gmbnomis@gmail.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit eda4ddf7e3a2245888e8c45c566fd514cdd5abbb upstream.
The following git commit changed the behavior of sscanf:
commit 53809751ac230a3611b5cdd375f3389f3207d471
Author: Jan Beulich <JBeulich@suse.com>
Date: Mon Dec 17 16:01:31 2012 -0800
sscanf: don't ignore field widths for numeric conversions
This broke the WWPN and LUN sysfs attributes for s390 reipl and dump
on panic.
Example:
$ echo 0x0123456701234567 > /sys/firmware/reipl/fcp/wwpn
$ cat /sys/firmware/reipl/fcp/wwpn
0x0001234567012345
So fix this and use format strings that work also with the
new sscanf implementation:
$ echo 0x012345670123456789 > /sys/firmware/reipl/fcp/wwpn
$ cat /sys/firmware/reipl/fcp/wwpn
0x0123456701234567
Reviewed-by: Steffen Maier <maier@linux.vnet.ibm.com>
Signed-off-by: Michael Holzheu <holzheu@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 949785996ec2250fa958fc3a924e5186e9a8fa2c upstream.
We are in the process of removing all the __cpuinit annotations.
While working on making that change, an existing problem was
made evident:
WARNING: arch/x86/kernel/built-in.o(.text+0x198f2): Section mismatch
in reference from the function cpu_init() to the function
.init.text:load_ucode_ap() The function cpu_init() references
the function __init load_ucode_ap(). This is often because cpu_init
lacks a __init annotation or the annotation of load_ucode_ap is wrong.
This now appears because in my working tree, cpu_init() is no longer
tagged as __cpuinit, and so the audit picks up the mismatch. The 2nd
hypothesis from the audit is the correct one, as there was an incorrect
__init tag on the prototype in the header (but __cpuinit was used on
the function itself.)
The audit is telling us that the prototype's __init annotation took
effect and the function did land in the .init.text section. Checking
with objdump on a mainline tree that still has __cpuinit shows that
the __cpuinit on the function takes precedence over the __init on the
prototype, but that won't be true once we make __cpuinit a no-op.
Even though we are removing __cpuinit, we temporarily align both
the function and the prototype on __cpuinit so that the changeset
can be applied to stable trees if desired.
[ hpa: build fix only, no object code change ]
Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Link: http://lkml.kernel.org/r/1371654926-11729-1-git-send-email-paul.gortmaker@windriver.com
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d1603990ea626668c78527376d9ec084d634202d upstream.
Fix kconfig warning and build errors on x86_64 by selecting BINFMT_ELF
when COMPAT_BINFMT_ELF is being selected.
warning: (IA32_EMULATION) selects COMPAT_BINFMT_ELF which has unmet direct dependencies (COMPAT && BINFMT_ELF)
fs/built-in.o: In function `elf_core_dump':
compat_binfmt_elf.c:(.text+0x3e093): undefined reference to `elf_core_extra_phdrs'
compat_binfmt_elf.c:(.text+0x3ebcd): undefined reference to `elf_core_extra_data_size'
compat_binfmt_elf.c:(.text+0x3eddd): undefined reference to `elf_core_write_extra_phdrs'
compat_binfmt_elf.c:(.text+0x3f004): undefined reference to `elf_core_write_extra_data'
[ hpa: This was sent to me for -next but it is a low risk build fix ]
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Link: http://lkml.kernel.org/r/51C0B614.5000708@infradead.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d8d386c10630d8f7837700f4c466443d49e12cc0 upstream.
Joshua reported: Commit cd7b304dfaf1 (x86, range: fix missing merge
during add range) broke mtrr cleanup on his setup in 3.9.5.
corresponding commit in upstream is fbe06b7bae7c.
*BAD*gran_size: 64K chunk_size: 16M num_reg: 6 lose cover RAM: -0G
https://bugzilla.kernel.org/show_bug.cgi?id=59491
So it rejects new var mtrr layout.
It turns out we have some problem with initial mtrr range retrieval.
The current sequence is:
x86_get_mtrr_mem_range
==> bunchs of add_range_with_merge
==> bunchs of subract_range
==> clean_sort_range
add_range_with_merge for [0,1M)
sort_range()
add_range_with_merge could have blank slots, so we can not just
sort only, that will have final result have extra blank slot in head.
So move that calling add_range_with_merge for [0,1M), with that we
could avoid extra clean_sort_range calling.
Reported-by: Joshua Covington <joshuacov@googlemail.com>
Tested-by: Joshua Covington <joshuacov@googlemail.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1371154622-8929-2-git-send-email-yinghai@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 764bcbc5a6d7a2f3e75c9f0e4caa984e2926e346 upstream.
__kvm_set_xcr function does the CPL check when set xcr. __kvm_set_xcr is
called in two flows, one is invoked by guest, call stack shown as below,
handle_xsetbv(or xsetbv_interception)
kvm_set_xcr
__kvm_set_xcr
the other one is invoked by host, for example during system reset:
kvm_arch_vcpu_ioctl
kvm_vcpu_ioctl_x86_set_xcrs
__kvm_set_xcr
The former does need the CPL check, but the latter does not.
Signed-off-by: Zhang Haoyu <haoyu.zhang@huawei.com>
[Tweaks to commit message. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 07868fc6aaf57847b0f3a3d53086b7556eb83f4a upstream.
kernel might hung in pvclock_clocksource_read() due to
uninitialized memory might contain odd version value in
following cycle:
do {
version = __pvclock_read_cycles(src, &ret, &flags);
} while ((src->version & 1) || version != src->version);
if secondary kvmclock is accessed before it's registered with kvm.
Clear garbage in pvclock shared memory area right after it's
allocated to avoid this issue.
Ref: https://bugzilla.kernel.org/show_bug.cgi?id=59521
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
[See BZ for analysis. We may want a different fix for 3.11, but
this is the safest for now - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b8cb62f82103083a6e8fa5470bfe634a2c06514d upstream.
1. Check for allocation failure
2. Clear the buffer contents, as they may actually be written to flash
3. Don't leak the buffer
Compile-tested only.
[ Tested successfully on my buggy ASUS machine - Matt ]
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 2cc7138f4347df939ce03f313e3d87794bab36f8 upstream.
pci_mmap_page_range() is needed for X11-server support on C8000 with ATI
FireGL card.
Signed-off-by Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9a66d1869d90f13fbaf83dcce5b1aeec86fbc699 upstream.
The C8000 workstation (64 bit kernel only) has a somewhat different
serial port configuration than other models.
Thomas Bogendoerfer sent a patch to fix this in September 2010, which
was now minimally modified by me.
Signed-off-by: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 91ea8207168793b365322be3c90a4ee9e8b03ed4 upstream.
Make sure that we really return -1 (instead of 0x00ff) as node id for
page frame numbers which are not physically available.
This finally fixes the kernel panic when running
cat /proc/kpageflags /proc/kpagecount.
Theoretically this patch now limits the number of physical memory ranges
to 127 instead of 254, but currently we have MAX_PHYSMEM_RANGES
hardcoded to 8 which is sufficient for all existing parisc machines.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ea99b1adf22abd62bdcf14b1c9a0a4d3664eefd8 upstream.
'boot_args' is an input args, and 'boot_command_line' has a fix length.
So use strlcpy() instead of strcpy() to avoid memory overflow.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Acked-by: Kyle McMartin <kyle@mcmartin.ca>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 766039022a480ede847659daaa78772bdcc598ae upstream.
There's a Makefile line setting cflags for CONFIG_PA7100. But that
Kconfig macro doesn't exist. There is a Kconfig symbol PA7000, which
covers both PA7000 and PA7100 processors. So let's use the corresponding
Kconfig macro.
Signed-off-by: Paul Bolle <pebolle@tiscali.nl>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit ae249b5fa27f9fba25aa59664d4338efc2dd2394 upstream.
With CONFIG_DISCONTIGMEM=y and multiple physical memory areas,
cat /proc/kpageflags triggers this kernel bug:
kernel BUG at arch/parisc/include/asm/mmzone.h:50!
CPU: 2 PID: 7848 Comm: cat Tainted: G D W 3.10.0-rc3-64bit #44
IAOQ[0]: kpageflags_read0x128/0x238
IAOQ[1]: kpageflags_read0x12c/0x238
RP(r2): proc_reg_read0xbc/0x130
Backtrace:
[<00000000402ca2d4>] proc_reg_read0xbc/0x130
[<0000000040235bcc>] vfs_read0xc4/0x1d0
[<0000000040235f0c>] SyS_read0x94/0xf0
[<0000000040105fc0>] syscall_exit0x0/0x14
kpageflags_read() walks through the whole memory, even if some memory
areas are physically not available. So, we should better not BUG on an
unavailable pfn in pfn_to_nid() but just return the expected value -1 or
0.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3f108de96ba449a8df3d7e3c053bf890fee2cb95 upstream.
'path.bc[i]' can be asigned by PCI_SLOT() which can '> 10', so sizeof(6
* "%u:" + "%u" + '\0') may be 21.
Since 'name' length is 20, it may be memory overflow.
And 'path.bc[i]' is 'unsigned char' for printing, we can be sure the
max length of 'name' must be less than 28.
So simplify thinking, we can use 28 instead of 20 directly, and do not
think of whether 'patchc.bc[i]' can '> 100'.
Signed-off-by: Chen Gang <gang.chen@asianux.com>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d96b51ec14650b490ab98e738bcc02309396e5bc upstream.
The logic to detect if the irq stack was already in use with
raw_spin_trylock() is wrong, because it will generate a "trylock failure
on UP" error message with CONFIG_SMP=n and CONFIG_DEBUG_SPINLOCK=y.
arch_spin_trylock() can't be used either since in the CONFIG_SMP=n case
no atomic protection is given and we are reentrant here. A mutex didn't
worked either and brings more overhead by turning off interrupts.
So, let's use the fastest path for parisc which is the ldcw instruction.
Counting how often the irq stack was used is pretty useless, so just
drop this piece of code.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit b63a2bbc0b9b106a93e11952ab057e2408f2eb02 upstream.
The get_stack_use_cr30 and get_stack_use_r30 macros allocate a stack
frame for external interrupts and interruptions requiring a stack frame.
They are currently not reentrant in that they save register context
before the stack is set or adjusted.
I have observed a number of system crashes where there was clear
evidence of stack corruption during interrupt processing, and as a
result register corruption. Some interruptions can still occur during
interruption processing, however external interrupts are disabled and
data TLB misses don't occur for absolute accesses. So, it's not entirely
clear what triggers this issue. Also, if an interruption occurs when
Q=0, it is generally not possible to recover as the shadowed registers
are not copied.
The attached patch reworks the get_stack_use_cr30 and get_stack_use_r30
macros to allocate stack before doing register saves. The new code is a
couple of instructions shorter than the old implementation. Thus, it's
an improvement even if it doesn't fully resolve the stack corruption
issue. Based on limited testing, it improves SMP system stability.
Signed-off-by: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit d0c3be806a3fe7f4abdb0f7e7287addb55e73f35 upstream.
Show number of floating point assistant and unaligned access fixup
handler in /proc/interrupts file.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 416821d3d68164909b2cbcf398e4ba0797f5f8a2 upstream.
This patch fixes few build issues which were introduced with the last
irq stack patch, e.g. the combination of stack overflow check and irq
stack.
Furthermore we now do proper locking and change the irq bh handler
to use the irq stack as well.
In /proc/interrupts one now can monitor how huge the irq stack has grown
and how often it was preferred over the kernel stack.
IRQ stacks are now enabled by default just to make sure that we not
overflow the kernel stack by accident.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0fc537d1d655cdae69b489dbba46ad617cfc1373 upstream.
Fix up build error on UP and show correctly number of function call
(ipi) irqs.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit cd85d5514d5c4d7e78abac923fc032457d0c5091 upstream.
Add framework and initial values for more fine grained statistics in
/proc/interrupts.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 200c880420a2c02a0899120ce52d801fad705b90 upstream.
Default kernel stack size on parisc is 16k. During tests we found that the
kernel stack can easily grow beyond 13k, which leaves 3k left for irq
processing.
This patch adds the possibility to activate an additional stack of 16k per CPU
which is being used during irq processing. This implementation does not yet
uses this irq stack for the irq bh handler.
The assembler code for call_on_stack was heavily cleaned up by John
David Anglin.
Signed-off-by: Helge Deller <deller@gmx.de>
CC: John David Anglin <dave.anglin@bell.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 9372450cc22d185f708e5cc3557cf991be4b7dc5 upstream.
Add the CONFIG_DEBUG_STACKOVERFLOW config option to enable checks to
detect kernel stack overflows.
Stack overflows can not be detected reliable since we do not want to
introduce too much overhead.
Instead, during irq processing in do_cpu_irq_mask() we check kernel
stack usage of the interrupted kernel process. Kernel threads can be
easily detected by checking the value of space register 7 (sr7) which
is zero when running inside the kernel.
Since THREAD_SIZE is 16k and PAGE_SIZE is 4k, reduce the alignment of
the init thread to the lower value (PAGE_SIZE) in the kernel
vmlinux.ld.S linker script.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 3cb3f839d306443f3d1e79b0bde1a2ad2c12b555 upstream.
gcc 4.7.x is emitting calls to __ffsdi2 where previously
it used to inline the appropriate ctz instructions.
While this needs to be fixed in gcc, it's also easy to avoid
having it cause build failures when building with those
compilers by exporting __ffsdi2 to modules.
Signed-off-by: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit abc41254181e901ef5eda2c884ca6cd88a186b6d upstream.
With this change, we no longer lose the innermost entry in the user-mode
part of the call chain. See also the x86 port, which includes the ip,
and the corresponding change in arch/arm.
Signed-off-by: Jed Davis <jld@mozilla.com>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 049be07053ebbf0ee8543caea23ae7bdf0765bb2 upstream.
This commit fixes the ID and mask for the PJ4B which was too
restrictive and didn't match the CPU of the Armada 370 SoC.
Signed-off-by: Gregory CLEMENT <gregory.clement@free-electrons.com>
Reviewed-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 691557941af4c12bd307ad81a4d9fa9c7743ac28 upstream.
On Cortex-A9 before version r1p0, the LoUIS bit field of the CLIDR
register returns zero when it should return one. This leads to cache
maintenance operations which rely on this value to not function as
intended, causing data corruption.
The workaround for this errata is to detect affected CPUs and correct
the LoUIS value read.
Acked-by: Will Deacon <will.deacon@arm.com>
Acked-by: Nicolas Pitre <nico@linaro.org>
Signed-off-by: Jon Medhurst <tixy@linaro.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 4089fe95bfed295c8ad38251d5fe02b6b0ba684c upstream.
MPP_F6281_MASK would be previously be returned when on mv88f6282,
which would disallow some valid MPP configurations.
Commit 830f8b91 (arm: plat-orion: fix printing of "MPP config
unavailable on this hardware") made this problem visible as an invalid
MPP configuration is now correctly detected and not applied.
Signed-off-by: Nicolas Schichan <nschichan@freebox.fr>
Signed-off-by: Jason Cooper <jason@lakedaemon.net>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 230b3034793247f61e6a0b08c44cf415f6d92981 upstream.
When replaying interrupts (as a result of the interrupt occurring
while soft-disabled), in the case of the decrementer, we are exclusively
testing for a pending timer target. However we also use decrementer
interrupts to trigger the new "irq_work", which in this case would
be missed.
This change the logic to force a replay in both cases of a timer
boundary reached and a decrementer interrupt having actually occurred
while disabled. The former test is still useful to catch cases where
a CPU having been hard-disabled for a long time completely misses the
interrupt due to a decrementer rollover.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Tested-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit bf593907f7236e95698a76b7c7a2bbf8b1165327 upstream.
Normally, the kernel emulates a few instructions that are unimplemented
on some processors (e.g. the old dcba instruction), or privileged (e.g.
mfpvr). The emulation of unimplemented instructions is currently not
working on the PowerNV platform. The reason is that on these machines,
unimplemented and illegal instructions cause a hypervisor emulation
assist interrupt, rather than a program interrupt as on older CPUs.
Our vector for the emulation assist interrupt just calls
program_check_exception() directly, without setting the bit in SRR1
that indicates an illegal instruction interrupt. This fixes it by
making the emulation assist interrupt set that bit before calling
program_check_interrupt(). With this, old programs that use no-longer
implemented instructions such as dcba now work again.
Signed-off-by: Paul Mackerras <paulus@samba.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 0e37739b1c96d65e6433998454985de994383019 upstream.
It's possible for us to crash when running with ftrace enabled, eg:
Bad kernel stack pointer bffffd12 at c00000000000a454
cpu 0x3: Vector: 300 (Data Access) at [c00000000ffe3d40]
pc: c00000000000a454: resume_kernel+0x34/0x60
lr: c00000000000335c: performance_monitor_common+0x15c/0x180
sp: bffffd12
msr: 8000000000001032
dar: bffffd12
dsisr: 42000000
If we look at current's stack (paca->__current->stack) we see it is
equal to c0000002ecab0000. Our stack is 16K, and comparing to
paca->kstack (c0000002ecab3e30) we can see that we have overflowed our
kernel stack. This leads to us writing over our struct thread_info, and
in this case we have corrupted thread_info->flags and set
_TIF_EMULATE_STACK_STORE.
Dumping the stack we see:
3:mon> t c0000002ecab0000
[c0000002ecab0000] c00000000002131c .performance_monitor_exception+0x5c/0x70
[c0000002ecab0080] c00000000000335c performance_monitor_common+0x15c/0x180
--- Exception: f01 (Performance Monitor) at c0000000000fb2ec .trace_hardirqs_off+0x1c/0x30
[c0000002ecab0370] c00000000016fdb0 .trace_graph_entry+0xb0/0x280 (unreliable)
[c0000002ecab0410] c00000000003d038 .prepare_ftrace_return+0x98/0x130
[c0000002ecab04b0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
[c0000002ecab0520] c0000000000d6b58 .idle_cpu+0x18/0x90
[c0000002ecab05a0] c00000000000a934 .return_to_handler+0x0/0x34
[c0000002ecab0620] c00000000001e660 .timer_interrupt+0x160/0x300
[c0000002ecab06d0] c0000000000025dc decrementer_common+0x15c/0x180
--- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
[c0000002ecab09c0] c0000000000fe044 .trace_hardirqs_on+0x14/0x30 (unreliable)
[c0000002ecab0fb0] c00000000016fe3c .trace_graph_entry+0x13c/0x280
[c0000002ecab1050] c00000000003d038 .prepare_ftrace_return+0x98/0x130
[c0000002ecab10f0] c00000000000a920 .ftrace_graph_caller+0x14/0x28
[c0000002ecab1160] c0000000000161f0 .__ppc64_runlatch_on+0x10/0x40
[c0000002ecab11d0] c00000000000a934 .return_to_handler+0x0/0x34
--- Exception: 901 (Decrementer) at c0000000000104d4 .arch_local_irq_restore+0x74/0xa0
... and so on
__ppc64_runlatch_on() is called from RUNLATCH_ON in the exception entry
path. At that point the irq state is not consistent, ie. interrupts are
hard disabled (by the exception entry), but the paca soft-enabled flag
may be out of sync.
This leads to the local_irq_restore() in trace_graph_entry() actually
enabling interrupts, which we do not want. Because we have not yet
reprogrammed the decrementer we immediately take another decrementer
exception, and recurse.
The fix is twofold. Firstly make sure we call DISABLE_INTS before
calling RUNLATCH_ON. The badly named DISABLE_INTS actually reconciles
the irq state in the paca with the hardware, making it safe again to
call local_irq_save/restore().
Although that should be sufficient to fix the bug, we also mark the
runlatch routines as notrace. They are called very early in the
exception entry and we are asking for trouble tracing them. They are
also fairly uninteresting and tracing them just adds unnecessary
overhead.
[ This regression was introduced by fe1952fc0afb9a2e4c79f103c08aef5d13db1873
"powerpc: Rework runlatch code" by myself --BenH
]
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f8b8404337de4e2466e2e1139ea68b1f8295974f upstream.
This patch reworks the UEFI anti-bricking code, including an effective
reversion of cc5a080c and 31ff2f20. It turns out that calling
QueryVariableInfo() from boot services results in some firmware
implementations jumping to physical addresses even after entering virtual
mode, so until we have 1:1 mappings for UEFI runtime space this isn't
going to work so well.
Reverting these gets us back to the situation where we'd refuse to create
variables on some systems because they classify deleted variables as "used"
until the firmware triggers a garbage collection run, which they won't do
until they reach a lower threshold. This results in it being impossible to
install a bootloader, which is unhelpful.
Feedback from Samsung indicates that the firmware doesn't need more than
5KB of storage space for its own purposes, so that seems like a reasonable
threshold. However, there's still no guarantee that a platform will attempt
garbage collection merely because it drops below this threshold. It seems
that this is often only triggered if an attempt to write generates a
genuine EFI_OUT_OF_RESOURCES error. We can force that by attempting to
create a variable larger than the remaining space. This should fail, but if
it somehow succeeds we can then immediately delete it.
I've tested this on the UEFI machines I have available, but I don't have
a Samsung and so can't verify that it avoids the bricking problem.
Signed-off-by: Matthew Garrett <matthew.garrett@nebula.com>
Signed-off-by: Lee, Chun-Y <jlee@suse.com> [ dummy variable cleanup ]
Signed-off-by: Matt Fleming <matt.fleming@intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c8a22d19dd238ede87aa0ac4f7dbea8da039b9c1 upstream.
Fixes a typo in register clearing code. Thanks to PaX Team for fixing
this originally, and James Troup for pointing it out.
Signed-off-by: Kees Cook <keescook@chromium.org>
Link: http://lkml.kernel.org/r/20130605184718.GA8396@www.outflux.net
Cc: PaX Team <pageexec@freemail.hu>
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 7de3d66b1387ddf5a37d9689e5eb8510fb75c765 upstream.
Commit
8d57470d x86, mm: setup page table in top-down
causes a kernel panic while setting mem=2G.
[mem 0x00000000-0x000fffff] page 4k
[mem 0x7fe00000-0x7fffffff] page 1G
[mem 0x7c000000-0x7fdfffff] page 1G
[mem 0x00100000-0x001fffff] page 4k
[mem 0x00200000-0x7bffffff] page 2M
for last entry is not what we want, we should have
[mem 0x00200000-0x3fffffff] page 2M
[mem 0x40000000-0x7bffffff] page 1G
Actually we merge the continuous ranges with same page size too early.
in this case, before merging we have
[mem 0x00200000-0x3fffffff] page 2M
[mem 0x40000000-0x7bffffff] page 2M
after merging them, will get
[mem 0x00200000-0x7bffffff] page 2M
even we can use 1G page to map
[mem 0x40000000-0x7bffffff]
that will cause problem, because we already map
[mem 0x7fe00000-0x7fffffff] page 1G
[mem 0x7c000000-0x7fdfffff] page 1G
with 1G page, aka [0x40000000-0x7fffffff] is mapped with 1G page already.
During phys_pud_init() for [0x40000000-0x7bffffff], it will not
reuse existing that pud page, and allocate new one then try to use
2M page to map it instead, as page_size_mask does not include
PG_LEVEL_1G. At end will have [7c000000-0x7fffffff] not mapped, loop
in phys_pmd_init stop mapping at 0x7bffffff.
That is right behavoir, it maps exact range with exact page size that
we ask, and we should explicitly call it to map [7c000000-0x7fffffff]
before or after mapping 0x40000000-0x7bffffff.
Anyway we need to make sure ranges' page_size_mask correct and consistent
after split_mem_range for each range.
Fix that by calling adjust_range_size_mask before merging range
with same page size.
-v2: update change log.
-v3: add more explanation why [7c000000-0x7fffffff] is not mapped, and
it causes panic.
Bisected-by: "Xie, ChanglongX" <changlongx.xie@intel.com>
Bisected-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reported-and-tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
Link: http://lkml.kernel.org/r/1370015587-20835-1-git-send-email-yinghai@kernel.org
Signed-off-by: H. Peter Anvin <hpa@linux.intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Commit 52f36be0f4e2 's390/pgtable: Fix check for pgste/storage key
handling', which was commit b56433cb782d upstream, added a use of
pgste to ptep_modify_prot_start(), but this variable does not exist.
In mainline, pgste was added by commit d3383632d4e8 's390/mm: add pte
invalidation notifier for kvm' and initialised to the return value of
pgste_get_lock(ptep). Initialise it similarly here.
Compile-tested only.
Signed-off-by: Ben Hutchings <ben@decadent.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit 466318a87f28cb3ba0d08a3b7ef1a37ae73d5aa7 upstream.
The xen_play_dead is an undead function. When the vCPU is told to
offline it ends up calling xen_play_dead wherin it calls the
VCPUOP_down hypercall which offlines the vCPU. However, when the
vCPU is onlined back, it resumes execution right after
VCPUOP_down hypercall.
That was OK (albeit the API for play_dead assumes that the CPU
stays dead and never returns) but with commit 4b0c0f294
(tick: Cleanup NOHZ per cpu data on cpu down) that is no longer safe
as said commit resets the ts->inidle which at the start of the
cpu_idle loop was set.
The net effect is that we get this warn:
Broke affinity for irq 16
installing Xen timer for CPU 1
cpu 1 spinlock event irq 48
------------[ cut here ]------------
WARNING: at /home/konrad/linux-linus/kernel/time/tick-sched.c:935 tick_nohz_idle_exit+0x195/0x1b0()
Modules linked in: dm_multipath dm_mod xen_evtchn iscsi_boot_sysfs
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 3.10.0-rc3upstream-00068-gdcdbe33 #1
Hardware name: BIOSTAR Group N61PB-M2S/N61PB-M2S, BIOS 6.00 PG 09/03/2009
ffffffff8193b448 ffff880039da5e60 ffffffff816707c8 ffff880039da5ea0
ffffffff8108ce8b ffff880039da4010 ffff88003fa8e500 ffff880039da4010
0000000000000001 ffff880039da4000 ffff880039da4010 ffff880039da5eb0
Call Trace:
[<ffffffff816707c8>] dump_stack+0x19/0x1b
[<ffffffff8108ce8b>] warn_slowpath_common+0x6b/0xa0
[<ffffffff8108ced5>] warn_slowpath_null+0x15/0x20
[<ffffffff810e4745>] tick_nohz_idle_exit+0x195/0x1b0
[<ffffffff810da755>] cpu_startup_entry+0x205/0x250
[<ffffffff81661070>] cpu_bringup_and_idle+0x13/0x15
---[ end trace 915c8c486004dda1 ]---
b/c ts_inidle is set to zero. Thomas suggested that we just add a workaround
to call tick_nohz_idle_enter before returning from xen_play_dead() - and
that is what this patch does and fixes the issue.
We also add the stable part b/c git commit 4b0c0f294 is on the stable
tree.
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
commit d82fb31abc46620b7c22758c75707069f2763646 upstream.
On pseries machines the detection for max_bus_speed should be done
through an OpenFirmware property. This patch adds a function to perform
this detection and a hook to perform dynamic adding of the function only
for pseries. This is done by overwriting the weak
pcibios_root_bridge_prepare function which is called by
pci_create_root_bus().
From: Lucas Kannebley Tavares <lucaskt@linux.vnet.ibm.com>
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit f1dd153121dcb872ae6cba8d52bec97519eb7d97 upstream.
Recent commit e61133dda480062d221f09e4fc18f66763f8ecd0 added support
for a new firmware feature to force an adapter to use 32 bit MSIs.
However, this firmware is not available for all systems. The hack below
allows devices needing 32 bit MSIs to work on these systems as well.
It is careful to only enable this on Gen2 slots, which should limit
this to configurations where this hack is needed and tested to work.
[Small change to factor out the hack into a separate function -- BenH]
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit e61133dda480062d221f09e4fc18f66763f8ecd0 upstream.
The following patch implements a new PAPR change which allows
the OS to force the use of 32 bit MSIs, regardless of what
the PCI capabilities indicate. This is required for some
devices that advertise support for 64 bit MSIs but don't
actually support them.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
commit c2e1d84523ad2a19e5be08c1f01999cc9e82652e upstream.
Add a PCI quirk for VGA devices on Power to set the default VGA device.
Ensures a default VGA is always set if a graphics adapter is present,
even if firmware did not initialize it. If more than one graphics
adapter is present, ensure the one initialized by firmware is set
as the default VGA device. This ensures that X autoconfiguration
will work.
Signed-off-by: Brian King <brking@linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: Kleber Sacilotto de Souza <klebers@linux.vnet.ibm.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>