3260 Commits

Author SHA1 Message Date
Alexander Gordeev
5caca32fba s390/cpcmd: use physical address for command and response
Virtual Console Function DIAGNOSE 8 accepts physical
addresses of command and response strings.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-26 15:21:28 +02:00
Claudio Imbrenda
f0a1a0615a KVM: s390: pv: avoid stalls when making pages secure
Improve make_secure_pte to avoid stalls when the system is heavily
overcommitted. This was especially problematic in kvm_s390_pv_unpack,
because of the loop over all pages that needed unpacking.

Due to the locks being held, it was not possible to simply replace
uv_call with uv_call_sched. A more complex approach was
needed, in which uv_call is replaced with __uv_call, which does not
loop. When the UVC needs to be executed again, -EAGAIN is returned, and
the caller (or its caller) will try again.

When -EAGAIN is returned, the path is the same as when the page is in
writeback (and the writeback check is also performed, which is
harmless).

Fixes: 214d9bbcd3a672 ("s390/mm: provide memory management functions for protected KVM guests")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Link: https://lore.kernel.org/r/20210920132502.36111-5-imbrenda@linux.ibm.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-10-25 09:20:39 +02:00
David Hildenbrand
46c22ffd27 s390/uv: fully validate the VMA before calling follow_page()
We should not walk/touch page tables outside of VMA boundaries when
holding only the mmap sem in read mode. Evil user space can modify the
VMA layout just before this function runs and e.g., trigger races with
page table removal code since commit dd2283f2605e ("mm: mmap: zap pages
with read mmap_sem in munmap").

find_vma() does not check if the address is >= the VMA start address;
use vma_lookup() instead.

Fixes: 214d9bbcd3a6 ("s390/mm: provide memory management functions for protected KVM guests")
Signed-off-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.ibm.com>
Acked-by: Heiko Carstens <hca@linux.ibm.com>
Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
Link: https://lore.kernel.org/r/20210909162248.14969-6-david@redhat.com
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
2021-10-25 09:20:38 +02:00
Eric W. Biederman
9fd5a04d8e exit: Remove calls of do_exit after noreturn versions of die
On nds32, openrisc, s390, sh, and xtensa the function die never
returns.  Mark die __noreturn so that no one expects die to return.
Remove the do_exit calls after die as they will never be reached.

Cc: Jonas Bonn <jonas@southpole.se>
Cc: Stefan Kristiansson <stefan.kristiansson@saunalahti.fi>
Cc: Stafford Horne <shorne@gmail.com>
Cc: openrisc@lists.librecores.org
Cc: Nick Hu <nickhu@andestech.com>
Cc: Greentime Hu <green.hu@gmail.com>
Cc: Vincent Chen <deanbo422@gmail.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: linux-s390@vger.kernel.org
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Rich Felker <dalias@libc.org>
Cc: linux-sh@vger.kernel.org
Cc: linux-xtensa@linux-xtensa.org
Cc: Chris Zankel <chris@zankel.net>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Fixes: 2.3.16
Fixes: 2.3.99-pre8
Fixes: 3f65ce4d141e ("[PATCH] xtensa: Architecture support for Tensilica Xtensa Part 5")
Fixes: 664eec400bf8 ("nds32: MMU fault handling and page table management")
Fixes: 61e85e367535 ("OpenRISC: Memory management")
Link: https://lkml.kernel.org/r/20211020174406.17889-2-ebiederm@xmission.com
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
2021-10-20 13:09:47 -05:00
Heiko Carstens
3d487acf1b s390: make STACK_FRAME_OVERHEAD available via asm-offsets.h
Make STACK_FRAME_OVERHEAD available via asm-offsets.h. This allows to
add s390 specific asm code to e.g. ftrace samples, without requiring
to add random header files, which might cause all sort of problems on
other architectures. asm-offsets.h can be assumed to be non-problematic.

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20211012133802.2460757-3-hca@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-19 15:39:53 +02:00
Heiko Carstens
2ab3a0a9fa s390/ftrace: add HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALL support
This is the s390 variant of commit 562955fe6a55 ("ftrace/x86: Add
register_ftrace_direct() for custom trampolines").

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Sven Schnelle <svens@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20211012133802.2460757-2-hca@linux.ibm.com
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-19 15:39:53 +02:00
Kees Cook
42a20f86dc sched: Add wrapper for get_wchan() to keep task blocked
Having a stable wchan means the process must be blocked and for it to
stay that way while performing stack unwinding.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Kees Cook <keescook@chromium.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> [arm]
Tested-by: Mark Rutland <mark.rutland@arm.com> [arm64]
Link: https://lkml.kernel.org/r/20211008111626.332092234@infradead.org
2021-10-15 11:25:14 +02:00
Heiko Carstens
894979689d s390/ftrace: provide separate ftrace_caller/ftrace_regs_caller implementations
ftrace_regs_caller is an alias to ftrace_caller - making ftrace_caller
quite heavyweight. Split the function and provide an ftrace_caller
implementation which comes with fewer instructions. Especially getting
rid of 'stosm' on each function entry should help here, e.g. to
have less performance impact on live patched functions.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:58 +02:00
Heiko Carstens
0c14c03795 s390/jump_label: add __init_or_module annotation
Add missing __init_or_module to arch_jump_label_transform_static().

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:58 +02:00
Heiko Carstens
acd6c9afc6 s390/jump_label: rename __jump_label_transform()
Trivial patch just to get rid of the leading underscores.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:58 +02:00
Heiko Carstens
4e0502b8b3 s390/jump_label: make use of HAVE_JUMP_LABEL_BATCH
Specify HAVE_JUMP_LABEL_BATCH in header file. This allows to make use
of the arch_jump_label_transform_queue()/arch_jump_label_transform_apply()
mechanism.

However unlike on x86, which currently is the only user of this
mechanism, the to be patched instructions are still directly
modified. The only difference to before is that serialization is only
done after all instructions have been modified. This way the number of
serialization/synchronization events is reduced.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:58 +02:00
Heiko Carstens
e5873d6f7a s390/ftrace: add missing serialization for graph caller patching
CPUs must be serialized also when ftrace_graph_caller gets patched.
This is missing since ftrace function graph support was added on s390.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:58 +02:00
Heiko Carstens
ae2b9a11b4 s390/ftrace: use text_poke_sync_lock()
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:58 +02:00
Heiko Carstens
1c27dfb24e s390/jump_label: use text_poke_sync()
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:58 +02:00
Heiko Carstens
e16d02ee3f s390: introduce text_poke_sync()
Introduce a text_poke_sync() similar to what x86 has. This can be
used to execute a serializing instruction on all CPUs (including
the current one).

Note: according to the Principles of Operation an IPI (= interrupt)
will already serialize a CPU, however it is better to be explicit. In
addition on_each_cpu() makes sure that also the current CPU get
serialized - just to make sure that possible preemption can prevent
some theoretical case where a CPU will not be serialized.

Therefore text_poke_sync() has to be used whenever code got modified,
just to avoid to rely on implicit serialization.

Also introduce text_poke_sync_lock() which will also disable CPU
hotplug, to prevent that any CPU is just going online with a
prefetched old version of a modified instruction.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-11 20:55:58 +02:00
Weizhao Ouyang
6644c654ea ftrace: Cleanup ftrace_dyn_arch_init()
Most of ARCHs use empty ftrace_dyn_arch_init(), introduce a weak common
ftrace_dyn_arch_init() to cleanup them.

Link: https://lkml.kernel.org/r/20210909090216.1955240-1-o451686892@gmail.com

Acked-by: Heiko Carstens <hca@linux.ibm.com> (s390)
Acked-by: Helge Deller <deller@gmx.de> (parisc)
Signed-off-by: Weizhao Ouyang <o451686892@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-10-08 19:41:39 -04:00
Alexander Gordeev
e3ec8e0f57 s390/boot: allocate amode31 section in decompressor
The memory for amode31 section is allocated from the decompressed
kernel. Instead, allocate that memory from the decompressor. This
is a prerequisite to allow initialization of the virtual memory
before the decompressed kernel takes over.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-04 09:49:37 +02:00
Alexander Gordeev
584315ed87 s390/boot: initialize control registers in decompressor
Partially revert commit 4555b9f34296 ("s390/boot: move
dma sections from decompressor to decompressed kernel").
This is a prerequisite to allow initialization of virtual
memory in decompressor and avoid overwriting of ASCEs in
the decompressed kernel otherwise.

Since the control registers 2, 5 and 15 are reinitialized
in the decompressed kernel again, this change does not
prevent relocating of amode31 section in any way.

Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-04 09:49:37 +02:00
Sven Schnelle
4df898dc06 s390/kprobes: add sanity check
Check whether the specified address points to the start of an
instruction to prevent users from setting a kprobe in the mid of
an instruction which would crash the kernel.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-04 09:49:36 +02:00
Heiko Carstens
b860b9346e s390/ftrace: remove dead code
ftrace_shared_hotpatch_trampoline() never returns NULL,
therefore quite a bit of code can be removed.

Acked-by: Ilya Leoshkevich <iii@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-10-04 09:49:36 +02:00
Richard Guy Briggs
1c30e3af8a audit: add support for the openat2 syscall
The openat2(2) syscall was added in kernel v5.6 with commit
fddb5d430ad9 ("open: introduce openat2(2) syscall").

Add the openat2(2) syscall to the audit syscall classifier.

Link: https://github.com/linux-audit/audit-kernel/issues/67
Link: https://lore.kernel.org/r/f5f1a4d8699613f8c02ce762807228c841c2e26f.1621363275.git.rgb@redhat.com
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
[PM: merge fuzz due to previous header rename, commit line wraps]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-01 16:52:48 -04:00
Richard Guy Briggs
42f355ef59 audit: replace magic audit syscall class numbers with macros
Replace audit syscall class magic numbers with macros.

This required putting the macros into new header file
include/linux/audit_arch.h since the syscall macros were
included for both 64 bit and 32 bit in any compat code, causing
redefinition warnings.

Link: https://lore.kernel.org/r/2300b1083a32aade7ae7efb95826e8f3f260b1df.1621363275.git.rgb@redhat.com
Signed-off-by: Richard Guy Briggs <rgb@redhat.com>
Acked-by: Christian Brauner <christian.brauner@ubuntu.com>
[PM: renamed header to audit_arch.h after consulting with Richard]
Signed-off-by: Paul Moore <paul@paul-moore.com>
2021-10-01 16:41:33 -04:00
Masami Hiramatsu
adf8a61a94 kprobes: treewide: Make it harder to refer kretprobe_trampoline directly
Since now there is kretprobe_trampoline_addr() for referring the
address of kretprobe trampoline code, we don't need to access
kretprobe_trampoline directly.

Make it harder to refer by renaming it to __kretprobe_trampoline().

Link: https://lkml.kernel.org/r/163163045446.489837.14510577516938803097.stgit@devnote2

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:06 -04:00
Masami Hiramatsu
96fed8ac2b kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler()
The __kretprobe_trampoline_handler() callback, called from low level
arch kprobes methods, has the 'trampoline_address' parameter, which is
entirely superfluous as it basically just replicates:

  dereference_kernel_function_descriptor(kretprobe_trampoline)

In fact we had bugs in arch code where it wasn't replicated correctly.

So remove this superfluous parameter and use kretprobe_trampoline_addr()
instead.

Link: https://lkml.kernel.org/r/163163044546.489837.13505751885476015002.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:06 -04:00
Masami Hiramatsu
9c89bb8e32 kprobes: treewide: Cleanup the error messages for kprobes
This clean up the error/notification messages in kprobes related code.
Basically this defines 'pr_fmt()' macros for each files and update
the messages which describes

 - what happened,
 - what is the kernel going to do or not do,
 - is the kernel fine,
 - what can the user do about it.

Also, if the message is not needed (e.g. the function returns unique
error code, or other error message is already shown.) remove it,
and replace the message with WARN_*() macros if suitable.

Link: https://lkml.kernel.org/r/163163036568.489837.14085396178727185469.stgit@devnote2

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
2021-09-30 21:24:05 -04:00
Linus Torvalds
f154c80667 2nd batch of s390 updates for 5.15 merge window
- Fix topology update on cpu hotplug, so notifiers see expected masks. This bug
   was uncovered with SCHED_CORE support.
 
 - Fix stack unwinding so that the correct number of entries are omitted like
   expected by common code. This fixes KCSAN selftests.
 
 - Add kmemleak annotation to stack_alloc to avoid false positive kmemleak
   warnings.
 
 - Avoid layering violation in common I/O code and don't unregister subchannel
   from child-drivers.
 
 - Remove xpram device driver for which no real use case exists since the kernel
   is 64 bit only. Also all hypervisors got required support removed in the
   meantime, which means the xpram device driver is dead code.
 
 - Fix -ENODEV handling of clp_get_state in our PCI code.
 
 - Enable KFENCE in debug defconfig.
 
 - Cleanup hugetlbfs s390 specific Kconfig dependency.
 
 - Quite a lot of trivial fixes to get rid of "W=1" warnings, and and other
   simple cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmE56jEACgkQIg7DeRsp
 bsI1sQ/+L91zvpjlWGEPjZhQmFJgDufuObLWJlhwOSPsOlezzJTujNscoisTe6Wm
 hfS1I/GzGsgcY3695xgBLgkPS37nrDdDLAgM4CnajOOalEZjbHgH5gcPiCPHfPAD
 QkvVFv2PjCQnaPx81kEIeK6tMFkvi6IRhfwhtGTf1fwoKDyw4IQT1couBsiuAy3n
 28/7NqMidS4gbv5X/BLK1Ez4as9d3PoecNre1debRPOZcdxIjCVDy7OW5MotI3ol
 ENsOHtNJe/orIDCc+QbsEP2xZJZdbZ0D0Zr/RQ4KEue42wKtGLzp/ZuG+UfTPyyx
 vlEDgMRgPHAGnceEImcMwK0XQwOn05sm13jOkbmpIwhmiE46rksAPf3cGL4DjlBP
 3rznDXoLYELX2OAHz2G4jfbrqFWDxbh5rp1NMr8tELvJV5xbdsMC11QFQY28swod
 /sUE39fX+zynwHSSttq0PXtKX4gr/d5ZMDdlhjl7lxlOgwEwDodBL3/xL81+C0qx
 jkQWDsJ6OpZ7iJpGvxaCUhFjlgihdi2InZ942inRGo/A/EaM6/7diExLiyqfaab5
 WEQ2BOlITUey85Fiu2WxeeweRChUwu+XNQt+Nx4hDF454K51htU/GJCUBW5Z5qtN
 Dm+/DolXkPY+joR7xBLHNzivob3ShcsoFiZjoBpTc/Hd18dhSQg=
 =fpJz
 -----END PGP SIGNATURE-----

Merge tag 's390-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull more s390 updates from Heiko Carstens:
 "Except for the xpram device driver removal it is all about fixes and
  cleanups.

   - Fix topology update on cpu hotplug, so notifiers see expected
     masks. This bug was uncovered with SCHED_CORE support.

   - Fix stack unwinding so that the correct number of entries are
     omitted like expected by common code. This fixes KCSAN selftests.

   - Add kmemleak annotation to stack_alloc to avoid false positive
     kmemleak warnings.

   - Avoid layering violation in common I/O code and don't unregister
     subchannel from child-drivers.

   - Remove xpram device driver for which no real use case exists since
     the kernel is 64 bit only. Also all hypervisors got required
     support removed in the meantime, which means the xpram device
     driver is dead code.

   - Fix -ENODEV handling of clp_get_state in our PCI code.

   - Enable KFENCE in debug defconfig.

   - Cleanup hugetlbfs s390 specific Kconfig dependency.

   - Quite a lot of trivial fixes to get rid of "W=1" warnings, and and
     other simple cleanups"

* tag 's390-5.15-2' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  hugetlbfs: s390 is always 64bit
  s390/ftrace: remove incorrect __va usage
  s390/zcrypt: remove incorrect kernel doc indicators
  scsi: zfcp: fix kernel doc comments
  s390/sclp: add __nonstring annotation
  s390/hmcdrv_ftp: fix kernel doc comment
  s390: remove xpram device driver
  s390/pci: read clp_list_pci_req only once
  s390/pci: fix clp_get_state() handling of -ENODEV
  s390/cio: fix kernel doc comment
  s390/ctrlchar: fix kernel doc comment
  s390/con3270: use proper type for tasklet function
  s390/cpum_cf: move array from header to C file
  s390/mm: fix kernel doc comments
  s390/topology: fix topology information when calling cpu hotplug notifiers
  s390/unwind: use current_frame_address() to unwind current task
  s390/configs: enable CONFIG_KFENCE in debug_defconfig
  s390/entry: make oklabel within CHKSTG macro local
  s390: add kmemleak annotation in stack_alloc()
  s390/cio: dont unregister subchannel from child-drivers
2021-09-09 12:55:12 -07:00
Arnd Bergmann
59ab844eed compat: remove some compat entry points
These are all handled correctly when calling the native system call entry
point, so remove the special cases.

Link: https://lkml.kernel.org/r/20210727144859.4150043-6-arnd@kernel.org
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Al Viro <viro@zeniv.linux.org.uk>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Feng Tang <feng.tang@intel.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: "James E.J. Bottomley" <James.Bottomley@HansenPartnership.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-08 15:32:35 -07:00
Heiko Carstens
9652cb805c s390/ftrace: remove incorrect __va usage
The address of ftrace_graph_caller is already virtual.
Using __va() to translate the address is wrong.

Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-09-08 14:23:31 +02:00
Heiko Carstens
5dddfaac4c s390/cpum_cf: move array from header to C file
Move array from header to C file to avoid that it gets defined in
every C file where the header is included:

In file included from arch/s390/kernel/perf_cpum_cf_common.c:19:
./arch/s390/include/asm/cpu_mcf.h:27:18: warning: ‘cpumf_ctr_ctl’ defined but not used [-Wunused-const-variable=]
   27 | static const u64 cpumf_ctr_ctl[CPUMF_CTR_SET_MAX] = {
      |                  ^~~~~~~~~~~~~

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-09-07 13:38:41 +02:00
Sven Schnelle
a052096bdd s390/topology: fix topology information when calling cpu hotplug notifiers
The cpu hotplug notifiers are called without updating the core/thread
masks when a new CPU is added. This causes problems with code setting
up data structures in a cpu hotplug notifier, and relying on that later
in normal code.

This caused a crash in the new core scheduling code (SCHED_CORE),
where rq->core was set up in a notifier depending on cpu masks.

To fix this, add a cpu_setup_mask which is used in update_cpu_masks()
instead of the cpu_online_mask to determine whether the cpu masks should
be set for a certain cpu. Also move update_cpu_masks() to update the
masks before calling notify_cpu_starting() so that the notifiers are
seeing the updated masks.

Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Cc: <stable@vger.kernel.org>
[hca@linux.ibm.com: get rid of cpu_online_mask handling]
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-09-07 13:38:41 +02:00
Linus Torvalds
14726903c8 Merge branch 'akpm' (patches from Andrew)
Merge misc updates from Andrew Morton:
 "173 patches.

  Subsystems affected by this series: ia64, ocfs2, block, and mm (debug,
  pagecache, gup, swap, shmem, memcg, selftests, pagemap, mremap,
  bootmem, sparsemem, vmalloc, kasan, pagealloc, memory-failure,
  hugetlb, userfaultfd, vmscan, compaction, mempolicy, memblock,
  oom-kill, migration, ksm, percpu, vmstat, and madvise)"

* emailed patches from Andrew Morton <akpm@linux-foundation.org>: (173 commits)
  mm/madvise: add MADV_WILLNEED to process_madvise()
  mm/vmstat: remove unneeded return value
  mm/vmstat: simplify the array size calculation
  mm/vmstat: correct some wrong comments
  mm/percpu,c: remove obsolete comments of pcpu_chunk_populated()
  selftests: vm: add COW time test for KSM pages
  selftests: vm: add KSM merging time test
  mm: KSM: fix data type
  selftests: vm: add KSM merging across nodes test
  selftests: vm: add KSM zero page merging test
  selftests: vm: add KSM unmerge test
  selftests: vm: add KSM merge test
  mm/migrate: correct kernel-doc notation
  mm: wire up syscall process_mrelease
  mm: introduce process_mrelease system call
  memblock: make memblock_find_in_range method private
  mm/mempolicy.c: use in_task() in mempolicy_slab_node()
  mm/mempolicy: unify the create() func for bind/interleave/prefer-many policies
  mm/mempolicy: advertise new MPOL_PREFERRED_MANY
  mm/hugetlb: add support for mempolicy MPOL_PREFERRED_MANY
  ...
2021-09-03 10:08:28 -07:00
Suren Baghdasaryan
dce4910396 mm: wire up syscall process_mrelease
Split off from prev patch in the series that implements the syscall.

Link: https://lkml.kernel.org/r/20210809185259.405936-2-surenb@google.com
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: David Hildenbrand <david@redhat.com>
Cc: David Rientjes <rientjes@google.com>
Cc: Florian Weimer <fweimer@redhat.com>
Cc: Jan Engelhardt <jengelh@inai.de>
Cc: Jann Horn <jannh@google.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Rik van Riel <riel@surriel.com>
Cc: Roman Gushchin <guro@fb.com>
Cc: Shakeel Butt <shakeelb@google.com>
Cc: Tim Murray <timmurray@google.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:17 -07:00
Mike Rapoport
a7259df767 memblock: make memblock_find_in_range method private
There are a lot of uses of memblock_find_in_range() along with
memblock_reserve() from the times memblock allocation APIs did not exist.

memblock_find_in_range() is the very core of memblock allocations, so any
future changes to its internal behaviour would mandate updates of all the
users outside memblock.

Replace the calls to memblock_find_in_range() with an equivalent calls to
memblock_phys_alloc() and memblock_phys_alloc_range() and make
memblock_find_in_range() private method of memblock.

This simplifies the callers, ensures that (unlikely) errors in
memblock_reserve() are handled and improves maintainability of
memblock_find_in_range().

Link: https://lkml.kernel.org/r/20210816122622.30279-1-rppt@kernel.org
Signed-off-by: Mike Rapoport <rppt@linux.ibm.com>
Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>		[arm64]
Acked-by: Kirill A. Shutemov <kirill.shtuemov@linux.intel.com>
Acked-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>	[ACPI]
Acked-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>
Acked-by: Nick Kossifidis <mick@ics.forth.gr>			[riscv]
Tested-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Rob Herring <robh@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2021-09-03 09:58:17 -07:00
Linus Torvalds
bcfeebbff3 Merge branch 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace
Pull exit cleanups from Eric Biederman:
 "In preparation of doing something about PTRACE_EVENT_EXIT I have
  started cleaning up various pieces of code related to do_exit. Most of
  that code I did not manage to get tested and reviewed before the merge
  window opened but a handful of very useful cleanups are ready to be
  merged.

  The first change is simply the removal of the bdflush system call. The
  code has now been disabled long enough that even the oldest userspace
  working userspace setups anyone can find to test are fine with the
  bdflush system call being removed.

  Changing m68k fsp040_die to use force_sigsegv(SIGSEGV) instead of
  calling do_exit directly is interesting only in that it is nearly the
  most difficult of the incorrect uses of do_exit to remove.

  The change to the seccomp code to simply send a signal instead of
  calling do_coredump directly is a very nice little cleanup made
  possible by realizing the existing signal sending helpers were missing
  a little bit of functionality that is easy to provide"

* 'exit-cleanups-for-v5.15' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace:
  signal/seccomp: Dump core when there is only one live thread
  signal/seccomp: Refactor seccomp signal and coredump generation
  signal/m68k: Use force_sigsegv(SIGSEGV) in fpsp040_die
  exit/bdflush: Remove the deprecated bdflush system call
2021-09-01 14:52:05 -07:00
Heiko Carstens
15256194ef s390/entry: make oklabel within CHKSTG macro local
Make the oklabel within the CHKSTG macro local. This makes sure that
tools like objdump and the crash debugging tool still disassemble full
functions where the macro has been used instead of stopping half way
where such a global label is used and one has to guess how to
disassemble the rest of such a function:

E.g.:

0000000000cb0270 <mcck_int_handler>:
  cb0270:       b2 05 03 20             stck    800
  ...
  cb0354:       a7 74 00 97             jne     cb0482 <oklabel270+0xe2>

0000000000cb0358 <oklabel243>:
  cb0358:       c0 e0 00 22 4e 8f       larl    %r14,10fa076 <opcode+0x2558>
  ...

Fixes: d35925b34996 ("s390/mcck: move storage error checks to assembler")
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-31 14:54:15 +02:00
Sven Schnelle
436fc4feea s390: add kmemleak annotation in stack_alloc()
kmemleak with enabled auto scanning reports that our stack allocation is
lost. This is because we're saving the pointer + STACK_INIT_OFFSET to
lowcore. When kmemleak now scans the objects, it thinks that this one is
lost because it can't find a corresponding pointer.

Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-31 14:54:14 +02:00
Linus Torvalds
c7a5238ef6 s390 updates for 5.15 merge window
- Improve ftrace code patching so that stop_machine is not required anymore.
   This requires a small common code patch acked by Steven Rostedt:
   https://lore.kernel.org/linux-s390/20210730220741.4da6fdf6@oasis.local.home/
 
 - Enable KCSAN for s390. This comes with a small common code change to fix a
   compile warning. Acked by Marco Elver:
   https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com
 
 - Add KFENCE support for s390. This also comes with a minimal x86 patch from
   Marco Elver who said also this can be carried via the s390 tree:
   https://lore.kernel.org/linux-s390/YQJdarx6XSUQ1tFZ@elver.google.com/
 
 - More changes to prepare the decompressor for relocation.
 
 - Enable DAT also for CPU restart path.
 
 - Final set of register asm removal patches; leaving only three locations where
   needed and sane.
 
 - Add NNPA, Vector-Packed-Decimal-Enhancement Facility 2, PCI MIO support to
   hwcaps flags.
 
 - Cleanup hwcaps implementation.
 
 - Add new instructions to in-kernel disassembler.
 
 - Various QDIO cleanups.
 
 - Add SCLP debug feature.
 
 - Various other cleanups and improvements all over the place.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEECMNfWEw3SLnmiLkZIg7DeRspbsIFAmEs1GsACgkQIg7DeRsp
 bsJ9+A/9FApCECNPgu6jOX4Ee+no+LxpCPUF8rvt56TFTLv7+Dhm7fJl0xQ9utsZ
 FyLMDAr1/FKdm2wBW23QZH4vEIt1bd6e/03DwwK+6IjHKZHRIfB8eGJMsLj/TDzm
 K6/+FI7qXjvpNXxgkCqXf5yESi/y5Dgr+16kTBhPZj5awRiwe5puPamji3uiQ45V
 r4MdGCCC9BnTZvtPpUrr8ImnUqHJ4/TMo1YYdykLbZFuAvvYUyZ5YG5kh0pMa8JZ
 DGJpfLQfy7ZNscIzdVhZtfzzESVtS6/AOeBzDMO1pbM1CGXtvpJJP0Wjlr/PGwoW
 fvuMHpqTlDi+TfNZiPP5lwsFC89xSd6gtZH7vAuI8kFCXgW3RMjABF6h/mzpH1WO
 jXyaSEZROc/83gxPMYyOYiqrKyAFPbpZ/Rnav2bvGQGneqx7RvmpF3GgA9WEo1PW
 rMDoEbLstJuHk0E2uEV+OnQd5F7MHNonzpYfp/7pyQ+PL8w2GExV9yngVc/f3TqG
 HYLC9rc3K6DkxZappcJm0qTb7lDTMFI7xK3g9RiqPQBJE1v1MYE/rai48nW69ypE
 bRNL76AjyXKo+zKR2wlhJVMY1I1+DarMopHhZj6fzQT5te1LLsv8OuTU2gkt6dIq
 YoSYOYvModf3HbKnJul2tszQG9yl+vpE9MiCyBQSsxIYXCriq/c=
 =WDRh
 -----END PGP SIGNATURE-----

Merge tag 's390-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux

Pull s390 updates from Heiko Carstens:

 - Improve ftrace code patching so that stop_machine is not required
   anymore. This requires a small common code patch acked by Steven
   Rostedt:

     https://lore.kernel.org/linux-s390/20210730220741.4da6fdf6@oasis.local.home/

 - Enable KCSAN for s390. This comes with a small common code change to
   fix a compile warning. Acked by Marco Elver:

     https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com

 - Add KFENCE support for s390. This also comes with a minimal x86 patch
   from Marco Elver who said also this can be carried via the s390 tree:

     https://lore.kernel.org/linux-s390/YQJdarx6XSUQ1tFZ@elver.google.com/

 - More changes to prepare the decompressor for relocation.

 - Enable DAT also for CPU restart path.

 - Final set of register asm removal patches; leaving only three
   locations where needed and sane.

 - Add NNPA, Vector-Packed-Decimal-Enhancement Facility 2, PCI MIO
   support to hwcaps flags.

 - Cleanup hwcaps implementation.

 - Add new instructions to in-kernel disassembler.

 - Various QDIO cleanups.

 - Add SCLP debug feature.

 - Various other cleanups and improvements all over the place.

* tag 's390-5.15-1' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (105 commits)
  s390: remove SCHED_CORE from defconfigs
  s390/smp: do not use nodat_stack for secondary CPU start
  s390/smp: enable DAT before CPU restart callback is called
  s390: update defconfigs
  s390/ap: fix state machine hang after failure to enable irq
  KVM: s390: generate kvm hypercall functions
  s390/sclp: add tracing of SCLP interactions
  s390/debug: add early tracing support
  s390/debug: fix debug area life cycle
  s390/debug: keep debug data on resize
  s390/diag: make restart_part2 a local label
  s390/mm,pageattr: fix walk_pte_level() early exit
  s390: fix typo in linker script
  s390: remove do_signal() prototype and do_notify_resume() function
  s390/crypto: fix all kernel-doc warnings in vfio_ap_ops.c
  s390/pci: improve DMA translation init and exit
  s390/pci: simplify CLP List PCI handling
  s390/pci: handle FH state mismatch only on disable
  s390/pci: fix misleading rc in clp_set_pci_fn()
  s390/boot: factor out offset_vmlinux_info() function
  ...
2021-08-30 13:07:15 -07:00
Alexander Gordeev
d6be5d0ad3 s390/smp: do not use nodat_stack for secondary CPU start
The secondary CPU start C routine uses nodat_stack as a
interim stack before finally switching to kernel_stack.
Such scheme is superfluous, since the assembler restart
interrupt handler (that secondary CPU starter is called
from) does not need to use any stack for switching into
DAT mode. Once DAT is on, any stack including virtually-
mapped one could be used.

Avoid the use of nodat_stack and smp_start_secondary()
helper. Instead, initiate kernel_stack directly from
the restart interrupt handler.

Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-26 20:22:13 +02:00
Alexander Gordeev
915fea04f9 s390/smp: enable DAT before CPU restart callback is called
The restart interrupt is triggered whenever a secondary CPU is
brought online, a remote function call dispatched from another
CPU or a manual PSW restart is initiated and causes the system
to kdump. The handling routine is always called with DAT turned
off. It then initializes the stack frame and invokes a callback.

The existing callbacks handle DAT as follows:

  * __do_restart() and __machine_kexec() turn in on upon entry;
  * __ipl_run(), __reipl_run() and __dump_run() do not turn it
    right away, but all of them call diag308() - which turns DAT
    on, but only if kasan is enabled;

In addition to the described complexity all callbacks (and the
functions they call) should avoid kasan instrumentation while
DAT is off.

This update enables DAT in the assembler restart handler and
relieves any callbacks (which are mostly C functions) from
dealing with DAT altogether.

There are four types of CPU restart that initialize control
registers in different ways:

  1. Start of secondary CPU on boot - control registers are
     inherited from the IPL CPU;
  2. Restart of online CPU - control registers of the CPU being
     restarted are kept;
  3. Hotplug of offline CPU - control registers are inherited
     from the starting CPU;
  4. Start of offline CPU triggered by manual PSW restart -
     the control registers are read from the absolute lowcore
     and contain the boot time IPL CPU values updated with all
     follow-up calls of smp_ctl_set_bit() and smp_ctl_clear_bit()
     routines;

In first three cases contents of the control registers is the
most recent. In the latter case control registers are good
enough to facilitate successful completion of kdump operation.

Suggested-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-26 20:22:12 +02:00
Peter Oberparleiter
70aa5d3982 s390/sclp: add tracing of SCLP interactions
Add tracing of interactions between the SCLP base driver, firmware and
other drivers to support problem determination in case of SCLP-related
issues.

For that purpose this patch introduces two new s390dbf debug areas:

  - sclp: An abbreviated log of all common interactions
  - sclp_err: A full log of failed or abnormal interactions

Tracing of full SCCB contents can be enabled for the sclp area by
setting its debug level to maximum (6).

Overview of added trace events:

  * Firmware interaction:
    - SRV1: Service call about to be issued
    - SRV2: Service call was issued
    - INT:  Interrupt received

  * Driver interaction:
    - RQAD: Request was added
    - RQOK: Request success
    - RQAB: Request aborted
    - RQTM: Request timed out
    - REG:  Event listener registered
    - UREG: Event listener unregistered
    - EVNT: Event callback
    - STCG: State-change callback

  * Abnormal events:
    - TMO:  A timeout occurred
    - UNEX: Unexpected SCCB completion

  * Other (not traced at default level):
    - SYN1: Synchronous wait start
    - SYN2: Synchronous wait end

Since the SCLP interface is used by console drivers this patch also
moves s390dbf printks outside the critical section protected by debug
area locks to prevent a potential deadlock that would otherwise be
introduced between console_owner --> sclp_lock --> sclp_debug.lock.

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 11:03:35 +02:00
Peter Oberparleiter
d72541f945 s390/debug: add early tracing support
Debug areas can currently only be used after s390dbf initialization
which occurs as a postcore_initcall. This is too late for tracing
earlier code such as that related to console_init().

This patch introduces a macro for defining a statically initialized
debug area that can be used to trace very early code. The macro is made
available for built-in code only because modules are never running
during early boot.

Example usage:

1. Define static debug area:

  DEFINE_STATIC_DEBUG_INFO(my_debug, "my_debug", 4, 1, 16,
			   &debug_hex_ascii_view);

2. Add trace entry:

  debug_event(&my_debug, 0, "DATA", 4);

Note: The debug area is automatically registered in debugfs during boot.
      A driver must not call any of the debug_register()/_unregister()
      functions on a static debug_info_t!

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 11:03:35 +02:00
Peter Oberparleiter
9372a82892 s390/debug: fix debug area life cycle
Currently allocation and registration of s390dbf debug areas are tied
together. As a result, a debug area cannot be unregistered and
re-registered while any process has an associated debugfs file open.

Fix this by splitting alloc/release from register/unregister.

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 11:03:35 +02:00
Peter Oberparleiter
1204777867 s390/debug: keep debug data on resize
Any previously recorded s390dbf debug data is reset when a debug area
is resized using the 'pages' sysfs attribute. This can make
live-debugging unnecessarily complex.

Fix this by copying existing debug data to the newly allocated debug
area when resizing.

Signed-off-by: Peter Oberparleiter <oberpar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 11:03:35 +02:00
Heiko Carstens
2879048c7e s390/diag: make restart_part2 a local label
Avoid that the "restart_part2" label, which is in the middle of a
function, appears in /proc/kallsyms.

Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 11:03:34 +02:00
Heiko Carstens
8b5f08b484 s390: fix typo in linker script
Rename amod31 to amode31 like it was supposed to be.

Fixes: c78d0c7484f0 ("s390: rename dma section to amode31")
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 11:03:34 +02:00
Sven Schnelle
28be5743c6 s390: remove do_signal() prototype and do_notify_resume() function
Both are no longer used since the conversion to generic
entry, therefore remove them.

Fixes: 56e62a737028 ("s390: convert to generic entry")
Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
Reviewed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-25 11:03:34 +02:00
Alexander Egorenkov
7c0eaa78b9 s390/sclp: reserve memory occupied by sclp early buffer
The memory block occupied by the SCLP early buffer that is allocated
by the decompressor and then handed over to the decompressed kernel,
must be reserved to prevent it from being reused for other purposes.
This is necessary because the SCLP early buffer is still in use
during kernel initialization.

Fixes: f1d3c5323772 ("s390/boot: move sclp early buffer from fixed address in asm to C")
Signed-off-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Reported-by: Alexander Gordeev <agordeev@linux.ibm.com>
Reviewed-by: Alexander Gordeev <agordeev@linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
2021-08-18 10:01:29 +02:00
Heiko Carstens
c78d0c7484 s390: rename dma section to amode31
The dma section name is confusing, since the code which resides within
that section has nothing to do with direct memory access.  Instead the
limitation is that the code has to run in 31 bit addressing mode, and
therefore has to reside below 2GB.  So the name was chosen since
ZONE_DMA is the same region.

To reduce confusion rename the section to amode31, which hopefully
describes better what this is about.

Note: this will also change vmcoreinfo strings
- SDMA=... gets renamed to SAMODE31=...
- EDMA=... gets renamed to EAMODE31=...

Acked-by: Vasily Gorbik <gor@linux.ibm.com>
Reviewed-by: Alexander Egorenkov <egorenar@linux.ibm.com>
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-05 14:10:53 +02:00
Sebastian Andrzej Siewior
a73de29320 s390: replace deprecated CPU-hotplug functions
The functions get_online_cpus() and put_online_cpus() have been
deprecated during the CPU hotplug rework. They map directly to
cpus_read_lock() and cpus_read_unlock().

Replace deprecated CPU-hotplug functions with the official version.
The behavior remains unchanged.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/20210803141621.780504-5-bigeasy@linutronix.de
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-05 14:10:53 +02:00
Ilya Leoshkevich
de5012b41e s390/ftrace: implement hotpatching
s390 allows hotpatching the mask of a conditional jump instruction.
Make use of this feature in order to avoid the expensive stop_machine()
call.

The new trampolines are split in 3 stages:

- A first stage is a 6-byte relative conditional long branch located at
  each function's entry point. Its offset always points to the second
  stage for the corresponding function, and its mask is either all 0s
  (ftrace off) or all 1s (ftrace on). The code for flipping the mask is
  borrowed from ftrace_{enable,disable}_ftrace_graph_caller. After
  flipping, ftrace_arch_code_modify_post_process() syncs with all the
  other CPUs by sending SIGPs.

- Second stages for vmlinux are stored in a separate part of the .text
  section reserved by the linker script, and in dynamically allocated
  memory for modules. This prevents the icache pollution. The total
  size of second stages is about 1.5% of that of the kernel image.

  Putting second stages in the .bss section is possible and decreases
  the size of the non-compressed vmlinux, but splits the kernel 1:1
  mapping, which is a bad tradeoff.

  Each second stage contains a call to the third stage, a pointer to
  the part of the intercepted function right after the first stage, and
  a pointer to an interceptor function (e.g. ftrace_caller).

  Second stages are 8-byte aligned for the future direct calls
  implementation.

- There are only two copies of the third stage: in the .text section
  for vmlinux and in dynamically allocated memory for modules. It can be
  an expoline, which is relatively large, so inlining it into each
  second stage is prohibitively expensive.

As a result of this organization, phoronix-test-suite with ftrace off
does not show any performance degradation.

Suggested-by: Sven Schnelle <svens@linux.ibm.com>
Suggested-by: Vasily Gorbik <gor@linux.ibm.com>
Co-developed-by: Heiko Carstens <hca@linux.ibm.com>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Link: https://lore.kernel.org/r/20210728212546.128248-3-iii@linux.ibm.com
Signed-off-by: Heiko Carstens <hca@linux.ibm.com>
2021-08-03 14:31:40 +02:00