Fix kprobes/x86 to support jprobes on ftrace-based kprobes.
Because of -mfentry support of ftrace, ftrace is now put
on the beginning of function where jprobes are put.
Originally ftrace-based kprobes doesn't support jprobe
because it will change regs->ip and ftrace doesn't support
changing IP and ftrace itself doesn't conflict jprobe.
However, ftrace -mfentry support moves mcount call on the
top of functions where jprobes are put. This means that
jprobe always conflicts with ftrace-based kprobe and fails.
This patch allows ftrace-based kprobes to support jprobes
by allowing to modify regs->ip and kprobes breakpoint
handler also allows to skip singlestepping because there
is a ftrace call (not an original instruction).
Link: http://lkml.kernel.org/r/20120905143125.10329.90836.stgit@localhost.localdomain
Reported-by: Fengguang Wu <fengguang.wu@intel.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Allow ftrace handlers to change RIP register (regs->ip)
in handlers. This will allow handlers to call another
function instead of original function.
Link: http://lkml.kernel.org/r/20120905143118.10329.5078.stgit@localhost.localdomain
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Current kprobe_ftrace_handler expects regs->ip == ip, but it is
incorrect (originally on x86-64). Actually, ftrace handler sets
regs->ip = ip + MCOUNT_INSN_SIZE.
kprobe_ftrace_handler must take care for that.
Link: http://lkml.kernel.org/r/20120905143112.10329.72069.stgit@localhost.localdomain
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
Adjust x86 regs.ip to ip + MCOUNT_INSN_SIZE as like as
on x86-64. This helps us to consolidate codes which use
regs->ip on both of x86/x86-64.
Link: http://lkml.kernel.org/r/20120905143100.10329.60109.stgit@localhost.localdomain
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Masami Hiramatsu <masami.hiramatsu.pt@hitachi.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
The main goal here is to have the resulting .config no carry any
options that aren't enabled and can't be (i.e such where the
default is "no" and can't be changed), so that if any such
option later gets a user visible prompt, the user will actually
be prompted on a "make ...oldconfig" rather than keeping the
previously invisible option disabled.
There's a little bit of other trivial cleanup mixed in here.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/504DEE19020000780009A285@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Following a relatively recent compiler change, make use of the
fact that for non-zero input BSF and TZCNT produce the same
result, and that CPUs not knowing of TZCNT will treat the
instruction as BSF (i.e. ignore what looks like a REP prefix to
them). The assumption here is that TZCNT would never have worse
performance than BSF.
For the moment, only do this when the respective generic-CPU
option is selected (as there are no specific-CPU options
covering the CPUs supporting TZCNT), and don't do that when size
optimization was requested.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/504DEA1B020000780009A277@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
The 64-bit special cases of the former two (the thrird one is
64-bit only anyway) don't need to use "long" temporaries, as the
result will always fit in a 32-bit variable, and the functions
return plain "int". This avoids a few REX prefixes, i.e.
minimally reduces code size.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Link: http://lkml.kernel.org/r/504DE550020000780009A258@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
On 64 bit x86 we save the current eflags in cpu_init for use in
ret_from_fork. Strictly speaking reserved bits in EFLAGS should
be read as written but in practise it is unlikely that EFLAGS
could ever be extended in this way and the kernel alread clears
any undefined flags early on.
The equivalent 32 bit code simply hard codes 0x0202 as the new
EFLAGS.
This change makes 64 bit use the same mechanism to setup the
initial EFLAGS on fork. Note that 64 bit resets EFLAGS before
calling schedule_tail() as opposed to 32 bit which calls
schedule_tail() first. Therefore the correct value for EFLAGS
has opposite IF bit.
Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Signed-off-by: Cyrill Gorcunov <gorcunov@openvz.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: "H. Peter Anvin" <hpa@zytor.com>
Cc: Brian Gerst <brgerst@gmail.com>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Pekka Enberg <penberg@kernel.org>
Cc: Andi Kleen <ak@linux.intel.com>
Link: http://lkml.kernel.org/r/20120824195847.GA31628@moon
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Steve Rostedt asked for the merge of a single commit, into both
the RCU and the perf/tracing tree:
| Josh made a change to the tracing code that affects both the
| work Paul McKenney and I are currently doing. At the last
| Kernel Summit back in August, Linus said when such a case
| exists, it is best to make a separate branch based off of his
| tree and place the change there. This way, the repositories
| that need to share the change can both pull them in and the
| SHA1 will match for both. Whichever branch is pulled in first
| by Linus will also pull in the necessary change for the other
| branch as well.
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Current implementation simply ignores attribute flags. Thus, there is
no notification to userland of unsupported features. Check syscall's
attribute flags to let userland know if a feature is supported by the
kernel. This is also needed to distinguish between future kernels what
might support a feature.
Cc: <stable@vger.kernel.org> v3.5..
Signed-off-by: Robert Richter <robert.richter@amd.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120910093018.GO8285@erda.amd.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
This patch exports the clockticks event and its encoding to user level.
The clockticks event was exported for Nehalem/Westmere but not for Sandy
Bridge (client). Given that it uses a special encoding, it needs to be
exported to user tools, so users can do:
# perf stat -a -C 0 -e uncore_cbox_0/clockticks/ sleep 1
Signed-off-by: Stephane Eranian <eranian@google.com>
Acked-by: Yan, Zheng <zheng.z.yan@intel.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20120829130122.GA32336@quad
Signed-off-by: Ingo Molnar <mingo@kernel.org>
find_highest_vector() and count_vectors():
- Instead of using magic values, define and use proper macros.
find_highest_vector():
- Remove likely() which is there only for historical reasons and not
doing correct branch predictions anymore. Using such heuristics
to optimize this function is not worth it now. Let CPUs predict
things instead.
- Stop checking word[0] separately. This was only needed for doing
likely() optimization.
- Use for loop, not while, to iterate over the register array to make
the code clearer.
Note that we actually confirmed that the likely() did wrong predictions
by inserting debug code.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Takuya Yoshikawa <yoshikawa.takuya@oss.ntt.co.jp>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
If the caller passes a valid kmap_op to m2p_add_override, we use
kmap_op->dev_bus_addr to store the original mfn, but dev_bus_addr is
part of the interface with Xen and if we are batching the hypercalls it
might not have been written by the hypervisor yet. That means that later
on Xen will write to it and we'll think that the original mfn is
actually what Xen has written to it.
Rather than "stealing" struct members from kmap_op, keep using
page->index to store the original mfn and add another parameter to
m2p_remove_override to get the corresponding kmap_op instead.
It is now responsibility of the caller to keep track of which kmap_op
corresponds to a particular page in the m2p_override (gntdev, the only
user of this interface that passes a valid kmap_op, is already doing that).
CC: stable@kernel.org
Reported-and-Tested-By: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* stable/128gb.v5.1:
xen/mmu: If the revector fails, don't attempt to revector anything else.
xen/p2m: When revectoring deal with holes in the P2M array.
xen/mmu: Release just the MFN list, not MFN list and part of pagetables.
xen/mmu: Remove from __ka space PMD entries for pagetables.
xen/mmu: Copy and revector the P2M tree.
xen/p2m: Add logic to revector a P2M tree to use __va leafs.
xen/mmu: Recycle the Xen provided L4, L3, and L2 pages
xen/mmu: For 64-bit do not call xen_map_identity_early
xen/mmu: use copy_page instead of memcpy.
xen/mmu: Provide comments describing the _ka and _va aliasing issue
xen/mmu: The xen_setup_kernel_pagetable doesn't need to return anything.
Revert "xen/x86: Workaround 64-bit hypervisor and 32-bit initial domain." and "xen/x86: Use memblock_reserve for sensitive areas."
xen/x86: Workaround 64-bit hypervisor and 32-bit initial domain.
xen/x86: Use memblock_reserve for sensitive areas.
xen/p2m: Fix the comment describing the P2M tree.
Conflicts:
arch/x86/xen/mmu.c
The pagetable_init is the old xen_pagetable_setup_done and xen_pagetable_setup_start
rolled in one.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* 'x86/platform' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (9690 commits)
x86: Document x86_init.paging.pagetable_init()
x86: xen: Cleanup and remove x86_init.paging.pagetable_setup_done()
x86: Move paging_init() call to x86_init.paging.pagetable_init()
x86: Rename pagetable_setup_start() to pagetable_init()
x86: Remove base argument from x86_init.paging.pagetable_setup_start
Linux 3.6-rc5
HID: tpkbd: work even if the new Lenovo Keyboard driver is not configured
Remove user-triggerable BUG from mpol_to_str
xen/pciback: Fix proper FLR steps.
uml: fix compile error in deliver_alarm()
dj: memory scribble in logi_dj
Fix order of arguments to compat_put_time[spec|val]
xen: Use correct masking in xen_swiotlb_alloc_coherent.
xen: fix logical error in tlb flushing
xen/p2m: Fix one-off error in checking the P2M tree directory.
powerpc: Don't use __put_user() in patch_instruction
powerpc: Make sure IPI handlers see data written by IPI senders
powerpc: Restore correct DSCR in context switch
powerpc: Fix DSCR inheritance in copy_thread()
powerpc: Keep thread.dscr and thread.dscr_inherit in sync
...
At this stage x86_init.paging.pagetable_setup_done is only used in the
XEN case. Move its content in the x86_init.paging.pagetable_init setup
function and remove the now unused x86_init.paging.pagetable_setup_done
remaining infrastructure.
Signed-off-by: Attilio Rao <attilio.rao@citrix.com>
Acked-by: <konrad.wilk@oracle.com>
Cc: <Ian.Campbell@citrix.com>
Cc: <Stefano.Stabellini@eu.citrix.com>
Cc: <xen-devel@lists.xensource.com>
Link: http://lkml.kernel.org/r/1345580561-8506-5-git-send-email-attilio.rao@citrix.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Move the paging_init() call to the platform specific pagetable_init()
function, so we can get rid of the extra pagetable_setup_done()
function pointer.
Signed-off-by: Attilio Rao <attilio.rao@citrix.com>
Acked-by: <konrad.wilk@oracle.com>
Cc: <Ian.Campbell@citrix.com>
Cc: <Stefano.Stabellini@eu.citrix.com>
Cc: <xen-devel@lists.xensource.com>
Link: http://lkml.kernel.org/r/1345580561-8506-4-git-send-email-attilio.rao@citrix.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
In preparation for unifying the pagetable_setup_start() and
pagetable_setup_done() setup functions, rename appropriately all the
infrastructure related to pagetable_setup_start().
Signed-off-by: Attilio Rao <attilio.rao@citrix.com>
Ackedd-by: <konrad.wilk@oracle.com>
Cc: <Ian.Campbell@citrix.com>
Cc: <Stefano.Stabellini@eu.citrix.com>
Cc: <xen-devel@lists.xensource.com>
Link: http://lkml.kernel.org/r/1345580561-8506-3-git-send-email-attilio.rao@citrix.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
We either use swapper_pg_dir or the argument is unused. Preparatory
patch to simplify platform pagetable setup further.
Signed-off-by: Attilio Rao <attilio.rao@citrix.com>
Ackedb-by: <konrad.wilk@oracle.com>
Cc: <Ian.Campbell@citrix.com>
Cc: <Stefano.Stabellini@eu.citrix.com>
Cc: <xen-devel@lists.xensource.com>
Link: http://lkml.kernel.org/r/1345580561-8506-2-git-send-email-attilio.rao@citrix.com
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQIcBAABAgAGBQJQTgR7AAoJEI7yEDeUysxlLnsP/jsGydL5WWsujpK9jdEbNa+r
Y2D8aPgKRe6Z6ZmWpe7jxymXA4haXh1kKO8nIVgfFOgtvN+Po8rZMOQE0TpuulFy
ANYa2N9Mi4+tpi9pqjcVgafjNa7xxRer04jZr5a789ry0MXjKspoUv/rFPpOsxrL
XlaPnRu1MLnCWmnLv3nvgatth3fIjo4X31FGmUwl4dRc59NIQREek7z3gAV0x4td
+JK6qvwzYOz0LhwV/Ssk84Sm97OOfXVaKWvxfC8y0H6nt0nXqmfGI/d5tWL8F9xz
92c0lelvbm6AMgx5yY5onRYDE6SkdlnAS563umIB58jMp44Fj+0MMPSfPQM1rBXC
NZ7mh2BJJ8Dsfi2NVneWW089o+uxSfI9HkuXGEiTzcu2mjISjuJ5ih7efAAmr7xc
e8dMM6fVTpNTazd6VGvsgno8f/Hr+D6qOZwZ8DsH4QLX0NJC41Dm6loHCE3saR2x
tjGwcbJEYqnboXEd09YOGqVOExR111cBkhVmf1qvOB71ENXoMnJ1L3e+AWojc1xN
klaMM8SQd0MsgjqTx4AOpPMKkzmONN1v44sFuQfARR1iwyY6dmLcJTqMQtDG9vJP
YTpZdkmswQrAnE2xVyMWkUd3vFj2d3JRU+aELYfIcVEuxx3hsRZdR8ezcrTRZLST
s4ii8XXmb9MGlggwrmk5
=BPP1
-----END PGP SIGNATURE-----
Merge tag 'kvm-3.6-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull KVM updates from Avi Kivity:
"A trio of KVM fixes: incorrect lookup of guest cpuid, an uninitialized
variable fix, and error path cleanup fix."
* tag 'kvm-3.6-2' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: fix error paths for failed gfn_to_page() calls
KVM: x86: Check INVPCID feature bit in EBX of leaf 7
KVM: PIC: fix use of uninitialised variable.
commit b6069a9570 (filter: add MOD operation) added generic
support for modulus operation in BPF.
This patch brings JIT support for x86_64
Signed-off-by: Eric Dumazet <edumazet@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: George Bakos <gbakos@alpinista.org>
Signed-off-by: David S. Miller <davem@davemloft.net>
This bug was triggered:
[ 4220.198458] BUG: unable to handle kernel paging request at fffffffffffffffe
[ 4220.203907] IP: [<ffffffff81104d85>] put_page+0xf/0x34
......
[ 4220.237326] Call Trace:
[ 4220.237361] [<ffffffffa03830d0>] kvm_arch_destroy_vm+0xf9/0x101 [kvm]
[ 4220.237382] [<ffffffffa036fe53>] kvm_put_kvm+0xcc/0x127 [kvm]
[ 4220.237401] [<ffffffffa03702bc>] kvm_vcpu_release+0x18/0x1c [kvm]
[ 4220.237407] [<ffffffff81145425>] __fput+0x111/0x1ed
[ 4220.237411] [<ffffffff8114550f>] ____fput+0xe/0x10
[ 4220.237418] [<ffffffff81063511>] task_work_run+0x5d/0x88
[ 4220.237424] [<ffffffff8104c3f7>] do_exit+0x2bf/0x7ca
The test case:
printf(fmt, ##args); \
exit(-1);} while (0)
static int create_vm(void)
{
int sys_fd, vm_fd;
sys_fd = open("/dev/kvm", O_RDWR);
if (sys_fd < 0)
die("open /dev/kvm fail.\n");
vm_fd = ioctl(sys_fd, KVM_CREATE_VM, 0);
if (vm_fd < 0)
die("KVM_CREATE_VM fail.\n");
return vm_fd;
}
static int create_vcpu(int vm_fd)
{
int vcpu_fd;
vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0);
if (vcpu_fd < 0)
die("KVM_CREATE_VCPU ioctl.\n");
printf("Create vcpu.\n");
return vcpu_fd;
}
static void *vcpu_thread(void *arg)
{
int vm_fd = (int)(long)arg;
create_vcpu(vm_fd);
return NULL;
}
int main(int argc, char *argv[])
{
pthread_t thread;
int vm_fd;
(void)argc;
(void)argv;
vm_fd = create_vm();
pthread_create(&thread, NULL, vcpu_thread, (void *)(long)vm_fd);
printf("Exit.\n");
return 0;
}
It caused by release kvm->arch.ept_identity_map_addr which is the
error page.
The parent thread can send KILL signal to the vcpu thread when it was
exiting which stops faulting pages and potentially allocating memory.
So gfn_to_pfn/gfn_to_page may fail at this time
Fixed by checking the page before it is used
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Checking the return of kvm_mmu_get_page is unnecessary since it is
guaranteed by memory cache
Signed-off-by: Xiao Guangrong <xiaoguangrong@linux.vnet.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
KVM lapic timer and tsc deadline timer based on hrtimer,
setting a leftmost node to rb tree and then do hrtimer reprogram.
If hrtimer not configured as high resolution, hrtimer_enqueue_reprogram
do nothing and then make kvm lapic timer and tsc deadline timer fail.
Signed-off-by: Liu, Jinsong <jinsong.liu@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Some AMD systems may round the frequencies in ACPI tables to 100MHz
boundaries. We can obtain the real frequencies from MSRs, so add a quirk
to fix these frequencies up on AMD systems.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
The programming model for P-states on modern AMD CPUs is very similar to
that of Intel and VIA. It makes sense to consolidate this support into one
driver rather than duplicating functionality between two of them. This
patch adds support for AMDs with hardware P-state control to acpi-cpufreq.
Signed-off-by: Matthew Garrett <mjg@redhat.com>
Signed-off-by: Andre Przywara <andre.przywara@amd.com>
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Checks and operations on the INVPCID feature bit should use EBX
of CPUID leaf 7 instead of ECX.
Signed-off-by: Junjie Mao <junjie.mao@intel.com>
Signed-off-by: Yongjie Ren <yongjien.ren@intel.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Since the shift count settable there is used for shifting values
of type "unsigned long", its value must not match or exceed
BITS_PER_LONG (otherwise the shift operations are undefined).
Similarly, the value must not be negative (but -1 must be
permitted, as that's the value used to distinguish the case of
the fine grained flushing being disabled).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Cc: Alex Shi <alex.shi@intel.com>
Link: http://lkml.kernel.org/r/5049B65C020000780009990C@nat28.tlf.novell.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
* Fix for TLB flushing introduced in v3.6
* Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but
in a 32-bit kernel we need to allocate for coherent pages from a
32-bit pool.
* When trying to re-use P2M nodes we had a one-off error and triggered
a BUG_ON check with specific CONFIG_ option.
* When doing FLR in Xen-PCI-backend we would first do FLR then save the
PCI configuration space. We needed to do it the other way around.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJQSS7TAAoJEFjIrFwIi8fJOBQH/03JBKBbFvewhop8T5Jww2c9
SWmMgzDm5HkxeWj5XnM+zlLoHrYFFu7tyuRLiny4weE0LdRl4adJ1TVpStxap/b6
MSQKe+tZevslaReBOsMpbCk3z7fEWNlAcpm6wMp1xYmLoHcr0JMpOCmzigbf7dwM
F4UWULheih9ME3UeqDAU8qgvfv6ccZ9vempO4TDWKjxfxfWODCNMRx+Ny+C7NNRk
QeoInHJUqcRkg0q0OIciF/YYDmn8hIH7HgfqomuMb6rEv2LOieLnWVuniEPWM75s
lECxro7v2Z9s1Std0WWKcFp8VZpI2scnQYU8aL8TpewHcbzKHQYyipuKPu7FiEQ=
=GfLE
-----END PGP SIGNATURE-----
Merge tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen
Pull Xen bug-fixes from Konrad Rzeszutek Wilk:
* Fix for TLB flushing introduced in v3.6
* Fix Xen-SWIOTLB not using proper DMA mask - device had 64bit but
in a 32-bit kernel we need to allocate for coherent pages from a
32-bit pool.
* When trying to re-use P2M nodes we had a one-off error and triggered
a BUG_ON check with specific CONFIG_ option.
* When doing FLR in Xen-PCI-backend we would first do FLR then save the
PCI configuration space. We needed to do it the other way around.
* tag 'stable/for-linus-3.6-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen:
xen/pciback: Fix proper FLR steps.
xen: Use correct masking in xen_swiotlb_alloc_coherent.
xen: fix logical error in tlb flushing
xen/p2m: Fix one-off error in checking the P2M tree directory.
Fix "constant 0xXXXXXXXXXXXXXXXX is so big it's unsigned long" sparse warnings.
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies, interleaves instructions better for out-of-order scheduling
and merges constant 16-bit rotation with round-key variable rotation.
tcrypt ECB results:
Intel Core i5-2450M:
size old-vs-new new-vs-generic old-vs-generic
enc dec enc dec enc dec
256 1.13x 1.19x 2.05x 2.17x 1.82x 1.82x
1k 1.18x 1.21x 2.26x 2.33x 1.93x 1.93x
8k 1.19x 1.19x 2.32x 2.33x 1.95x 1.95x
[v2]
- Do instruction interleaving another way to avoid adding new FPU<=>CPU
register moves as these cause performance drop on Bulldozer.
- Improvements to round-key variable rotation handling.
- Further interleaving improvements for better out-of-order scheduling.
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Patch replaces 'movb' instructions with 'movzbl' to break false register
dependencies, interleaves instructions better for out-of-order scheduling
and merges constant 16-bit rotation with round-key variable rotation.
tcrypt ECB results (128bit key):
Intel Core i5-2450M:
size old-vs-new new-vs-generic old-vs-generic
enc dec enc dec enc dec
256 1.18x 1.18x 2.45x 2.47x 2.08x 2.10x
1k 1.20x 1.20x 2.73x 2.73x 2.28x 2.28x
8k 1.20x 1.19x 2.73x 2.73x 2.28x 2.29x
[v2]
- Do instruction interleaving another way to avoid adding new FPU<=>CPU
register moves as these cause performance drop on Bulldozer.
- Improvements to round-key variable rotation handling.
- Further interleaving improvements for better out-of-order scheduling.
Cc: Johannes Goetzfried <Johannes.Goetzfried@informatik.stud.uni-erlangen.de>
Signed-off-by: Jussi Kivilinna <jussi.kivilinna@mbnet.fi>
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
interrupt_bitmap is KVM_NR_INTERRUPTS bits in size,
so just use that instead of hard-coded constants
and math.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Optimize "rep ins" by allowing emulator to write back more than one
datum at a time. Introduce new operand type OP_MEM_STR which tells
writeback() that dst contains pointer to an array that should be written
back as opposite to just one data element.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Remove unneeded segment argument. Address structure already has correct
segment which was put there during decode.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Current code assumes that IO exit was due to instruction emulation
and handles execution back to emulator directly. This patch adds new
userspace IO exit completion callback that can be set by any other code
that caused IO exit to userspace.
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Other arches do not need this.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
v2: fix incorrect deletion of mmio sptes on gpa move (noticed by Takuya)
Signed-off-by: Avi Kivity <avi@redhat.com>
Introducing kvm_arch_flush_shadow_memslot, to invalidate the
translations of a single memory slot.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Callers of xen_remap_domain_range() need to know if the remap failed
because frame is currently paged out. So they can retry the remap
later on. Return -ENOENT in this case.
This assumes that the error codes returned by Xen are a subset of
those used by the kernel. It is unclear if this is defined as part of
the hypercall ABI.
Acked-by: Andres Lagar-Cavilla <andres@lagarcavilla.org>
Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
While TLB_FLUSH_ALL gets passed as 'end' argument to
flush_tlb_others(), the Xen code was made to check its 'start'
parameter. That may give a incorrect op.cmd to MMUEXT_INVLPG_MULTI
instead of MMUEXT_TLB_FLUSH_MULTI. Then it causes some page can not
be flushed from TLB.
This patch fixed this issue.
Reported-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Alex Shi <alex.shi@intel.com>
Acked-by: Jan Beulich <jbeulich@suse.com>
Tested-by: Yongjie Ren <yongjie.ren@intel.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
* commit '4cb38750d49010ae72e718d46605ac9ba5a851b4': (6849 commits)
bcma: fix invalid PMU chip control masks
[libata] pata_cmd64x: whitespace cleanup
libata-acpi: fix up for acpi_pm_device_sleep_state API
sata_dwc_460ex: device tree may specify dma_channel
ahci, trivial: fixed coding style issues related to braces
ahci_platform: add hibernation callbacks
libata-eh.c: local functions should not be exposed globally
libata-transport.c: local functions should not be exposed globally
sata_dwc_460ex: support hardreset
ata: use module_pci_driver
drivers/ata/pata_pcmcia.c: adjust suspicious bit operation
pata_imx: Convert to clk_prepare_enable/clk_disable_unprepare
ahci: Enable SB600 64bit DMA on MSI K9AGM2 (MS-7327) v2
[libata] Prevent interface errors with Seagate FreeAgent GoFlex
drivers/acpi/glue: revert accidental license-related 6b66d95895c bits
libata-acpi: add missing inlines in libata.h
i2c-omap: Add support for I2C_M_STOP message flag
i2c: Fall back to emulated SMBus if the operation isn't supported natively
i2c: Add SCCB support
i2c-tiny-usb: Add support for the Robofuzz OSIF USB/I2C converter
...
We would traverse the full P2M top directory (from 0->MAX_DOMAIN_PAGES
inclusive) when trying to figure out whether we can re-use some of the
P2M middle leafs.
Which meant that if the kernel was compiled with MAX_DOMAIN_PAGES=512
we would try to use the 512th entry. Fortunately for us the p2m_top_index
has a check for this:
BUG_ON(pfn >= MAX_P2M_PFN);
which we hit and saw this:
(XEN) domain_crash_sync called from entry.S
(XEN) Domain 0 (vcpu#0) crashed on cpu#0:
(XEN) ----[ Xen-4.1.2-OVM x86_64 debug=n Tainted: C ]----
(XEN) CPU: 0
(XEN) RIP: e033:[<ffffffff819cadeb>]
(XEN) RFLAGS: 0000000000000212 EM: 1 CONTEXT: pv guest
(XEN) rax: ffffffff81db5000 rbx: ffffffff81db4000 rcx: 0000000000000000
(XEN) rdx: 0000000000480211 rsi: 0000000000000000 rdi: ffffffff81db4000
(XEN) rbp: ffffffff81793db8 rsp: ffffffff81793d38 r8: 0000000008000000
(XEN) r9: 4000000000000000 r10: 0000000000000000 r11: ffffffff81db7000
(XEN) r12: 0000000000000ff8 r13: ffffffff81df1ff8 r14: ffffffff81db6000
(XEN) r15: 0000000000000ff8 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 0000000661795000 cr2: 0000000000000000
Fixes-Oracle-Bug: 14570662
CC: stable@vger.kernel.org # only for v3.5
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
We never modify direct_access_msrs[], msrpm_ranges[],
svm_exit_handlers[] or x86_intercept_map[] at runtime.
Mark them r/o.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Cc: Joerg Roedel <joerg.roedel@amd.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
We use vmcs_field_to_offset_table[], kvm_vmx_segment_fields[] and
kvm_vmx_exit_handlers[] as lookup tables only -- make them r/o.
Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Avi Kivity <avi@redhat.com>