Commit Graph

226 Commits

Author SHA1 Message Date
Linus Torvalds
8e9a2dba86 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull core locking updates from Ingo Molnar:
 "The main changes in this cycle are:

   - Another attempt at enabling cross-release lockdep dependency
     tracking (automatically part of CONFIG_PROVE_LOCKING=y), this time
     with better performance and fewer false positives. (Byungchul Park)

   - Introduce lockdep_assert_irqs_enabled()/disabled() and convert
     open-coded equivalents to lockdep variants. (Frederic Weisbecker)

   - Add down_read_killable() and use it in the VFS's iterate_dir()
     method. (Kirill Tkhai)

   - Convert remaining uses of ACCESS_ONCE() to
     READ_ONCE()/WRITE_ONCE(). Most of the conversion was Coccinelle
     driven. (Mark Rutland, Paul E. McKenney)

   - Get rid of lockless_dereference(), by strengthening Alpha atomics,
     strengthening READ_ONCE() with smp_read_barrier_depends() and thus
     being able to convert users of lockless_dereference() to
     READ_ONCE(). (Will Deacon)

   - Various micro-optimizations:

        - better PV qspinlocks (Waiman Long),
        - better x86 barriers (Michael S. Tsirkin)
        - better x86 refcounts (Kees Cook)

   - ... plus other fixes and enhancements. (Borislav Petkov, Juergen
     Gross, Miguel Bernal Marin)"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  locking/x86: Use LOCK ADD for smp_mb() instead of MFENCE
  rcu: Use lockdep to assert IRQs are disabled/enabled
  netpoll: Use lockdep to assert IRQs are disabled/enabled
  timers/posix-cpu-timers: Use lockdep to assert IRQs are disabled/enabled
  sched/clock, sched/cputime: Use lockdep to assert IRQs are disabled/enabled
  irq_work: Use lockdep to assert IRQs are disabled/enabled
  irq/timings: Use lockdep to assert IRQs are disabled/enabled
  perf/core: Use lockdep to assert IRQs are disabled/enabled
  x86: Use lockdep to assert IRQs are disabled/enabled
  smp/core: Use lockdep to assert IRQs are disabled/enabled
  timers/hrtimer: Use lockdep to assert IRQs are disabled/enabled
  timers/nohz: Use lockdep to assert IRQs are disabled/enabled
  workqueue: Use lockdep to assert IRQs are disabled/enabled
  irq/softirqs: Use lockdep to assert IRQs are disabled/enabled
  locking/lockdep: Add IRQs disabled/enabled assertion APIs: lockdep_assert_irqs_enabled()/disabled()
  locking/pvqspinlock: Implement hybrid PV queued/unfair locks
  locking/rwlocks: Fix comments
  x86/paravirt: Set up the virt_spin_lock_key after static keys get initialized
  block, locking/lockdep: Assign a lock_class per gendisk used for wait_for_completion()
  workqueue: Remove now redundant lock acquisitions wrt. workqueue flushes
  ...
2017-11-13 12:38:26 -08:00
Linus Torvalds
d60a540ac5 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Heiko Carstens:
 "Since Martin is on vacation you get the s390 pull request for the
  v4.15 merge window this time from me.

  Besides a lot of cleanups and bug fixes these are the most important
  changes:

   - a new regset for runtime instrumentation registers

   - hardware accelerated AES-GCM support for the aes_s390 module

   - support for the new CEX6S crypto cards

   - support for FORTIFY_SOURCE

   - addition of missing z13 and new z14 instructions to the in-kernel
     disassembler

   - generate opcode tables for the in-kernel disassembler out of a
     simple text file instead of having to manually maintain those
     tables

   - fast memset16, memset32 and memset64 implementations

   - removal of named saved segment support

   - hardware counter support for z14

   - queued spinlocks and queued rwlocks implementations for s390

   - use the stack_depth tracking feature for s390 BPF JIT

   - a new s390_sthyi system call which emulates the sthyi (store
     hypervisor information) instruction

   - removal of the old KVM virtio transport

   - an s390 specific CPU alternatives implementation which is used in
     the new spinlock code"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (88 commits)
  MAINTAINERS: add virtio-ccw.h to virtio/s390 section
  s390/noexec: execute kexec datamover without DAT
  s390: fix transactional execution control register handling
  s390/bpf: take advantage of stack_depth tracking
  s390: simplify transactional execution elf hwcap handling
  s390/zcrypt: Rework struct ap_qact_ap_info.
  s390/virtio: remove unused header file kvm_virtio.h
  s390: avoid undefined behaviour
  s390/disassembler: generate opcode tables from text file
  s390/disassembler: remove insn_to_mnemonic()
  s390/dasd: avoid calling do_gettimeofday()
  s390: vfio-ccw: Do not attempt to free no-op, test and tic cda.
  s390: remove named saved segment support
  s390/archrandom: Reconsider s390 arch random implementation
  s390/pci: do not require AIS facility
  s390/qdio: sanitize put_indicator
  s390/qdio: use atomic_cmpxchg
  s390/nmi: avoid using long-displacement facility
  s390: pass endianness info to sparse
  s390/decompressor: remove informational messages
  ...
2017-11-13 11:47:01 -08:00
Ingo Molnar
8c5db92a70 Merge branch 'linus' into locking/core, to resolve conflicts
Conflicts:
	include/linux/compiler-clang.h
	include/linux/compiler-gcc.h
	include/linux/compiler-intel.h
	include/uapi/linux/stddef.h

Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-11-07 10:32:44 +01:00
Greg Kroah-Hartman
b24413180f License cleanup: add SPDX GPL-2.0 license identifier to files with no license
Many source files in the tree are missing licensing information, which
makes it harder for compliance tools to determine the correct license.

By default all files without license information are under the default
license of the kernel, which is GPL version 2.

Update the files which contain no license information with the 'GPL-2.0'
SPDX license identifier.  The SPDX identifier is a legally binding
shorthand, which can be used instead of the full boiler plate text.

This patch is based on work done by Thomas Gleixner and Kate Stewart and
Philippe Ombredanne.

How this work was done:

Patches were generated and checked against linux-4.14-rc6 for a subset of
the use cases:
 - file had no licensing information it it.
 - file was a */uapi/* one with no licensing information in it,
 - file was a */uapi/* one with existing licensing information,

Further patches will be generated in subsequent months to fix up cases
where non-standard license headers were used, and references to license
had to be inferred by heuristics based on keywords.

The analysis to determine which SPDX License Identifier to be applied to
a file was done in a spreadsheet of side by side results from of the
output of two independent scanners (ScanCode & Windriver) producing SPDX
tag:value files created by Philippe Ombredanne.  Philippe prepared the
base worksheet, and did an initial spot review of a few 1000 files.

The 4.13 kernel was the starting point of the analysis with 60,537 files
assessed.  Kate Stewart did a file by file comparison of the scanner
results in the spreadsheet to determine which SPDX license identifier(s)
to be applied to the file. She confirmed any determination that was not
immediately clear with lawyers working with the Linux Foundation.

Criteria used to select files for SPDX license identifier tagging was:
 - Files considered eligible had to be source code files.
 - Make and config files were included as candidates if they contained >5
   lines of source
 - File already had some variant of a license header in it (even if <5
   lines).

All documentation files were explicitly excluded.

The following heuristics were used to determine which SPDX license
identifiers to apply.

 - when both scanners couldn't find any license traces, file was
   considered to have no license information in it, and the top level
   COPYING file license applied.

   For non */uapi/* files that summary was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0                                              11139

   and resulted in the first patch in this series.

   If that file was a */uapi/* path one, it was "GPL-2.0 WITH
   Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that was:

   SPDX license identifier                            # files
   ---------------------------------------------------|-------
   GPL-2.0 WITH Linux-syscall-note                        930

   and resulted in the second patch in this series.

 - if a file had some form of licensing information in it, and was one
   of the */uapi/* ones, it was denoted with the Linux-syscall-note if
   any GPL family license was found in the file or had no licensing in
   it (per prior point).  Results summary:

   SPDX license identifier                            # files
   ---------------------------------------------------|------
   GPL-2.0 WITH Linux-syscall-note                       270
   GPL-2.0+ WITH Linux-syscall-note                      169
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
   ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
   LGPL-2.1+ WITH Linux-syscall-note                      15
   GPL-1.0+ WITH Linux-syscall-note                       14
   ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
   LGPL-2.0+ WITH Linux-syscall-note                       4
   LGPL-2.1 WITH Linux-syscall-note                        3
   ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
   ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1

   and that resulted in the third patch in this series.

 - when the two scanners agreed on the detected license(s), that became
   the concluded license(s).

 - when there was disagreement between the two scanners (one detected a
   license but the other didn't, or they both detected different
   licenses) a manual inspection of the file occurred.

 - In most cases a manual inspection of the information in the file
   resulted in a clear resolution of the license that should apply (and
   which scanner probably needed to revisit its heuristics).

 - When it was not immediately clear, the license identifier was
   confirmed with lawyers working with the Linux Foundation.

 - If there was any question as to the appropriate license identifier,
   the file was flagged for further research and to be revisited later
   in time.

In total, over 70 hours of logged manual review was done on the
spreadsheet to determine the SPDX license identifiers to apply to the
source files by Kate, Philippe, Thomas and, in some cases, confirmation
by lawyers working with the Linux Foundation.

Kate also obtained a third independent scan of the 4.13 code base from
FOSSology, and compared selected files where the other two scanners
disagreed against that SPDX file, to see if there was new insights.  The
Windriver scanner is based on an older version of FOSSology in part, so
they are related.

Thomas did random spot checks in about 500 files from the spreadsheets
for the uapi headers and agreed with SPDX license identifier in the
files he inspected. For the non-uapi files Thomas did random spot checks
in about 15000 files.

In initial set of patches against 4.14-rc6, 3 files were found to have
copy/paste license identifier errors, and have been fixed to reflect the
correct identifier.

Additionally Philippe spent 10 hours this week doing a detailed manual
inspection and review of the 12,461 patched files from the initial patch
version early this week with:
 - a full scancode scan run, collecting the matched texts, detected
   license ids and scores
 - reviewing anything where there was a license detected (about 500+
   files) to ensure that the applied SPDX license was correct
 - reviewing anything where there was no detection but the patch license
   was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
   SPDX license was correct

This produced a worksheet with 20 files needing minor correction.  This
worksheet was then exported into 3 different .csv files for the
different types of files to be modified.

These .csv files were then reviewed by Greg.  Thomas wrote a script to
parse the csv files and add the proper SPDX tag to the file, in the
format that the file expected.  This script was further refined by Greg
based on the output to detect more types of files automatically and to
distinguish between header and source .c files (which need different
comment types.)  Finally Greg ran the script using the .csv files to
generate the patches.

Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2017-11-02 11:10:55 +01:00
Mark Rutland
6aa7de0591 locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE()
Please do not apply this to mainline directly, instead please re-run the
coccinelle script shown below and apply its output.

For several reasons, it is desirable to use {READ,WRITE}_ONCE() in
preference to ACCESS_ONCE(), and new code is expected to use one of the
former. So far, there's been no reason to change most existing uses of
ACCESS_ONCE(), as these aren't harmful, and changing them results in
churn.

However, for some features, the read/write distinction is critical to
correct operation. To distinguish these cases, separate read/write
accessors must be used. This patch migrates (most) remaining
ACCESS_ONCE() instances to {READ,WRITE}_ONCE(), using the following
coccinelle script:

----
// Convert trivial ACCESS_ONCE() uses to equivalent READ_ONCE() and
// WRITE_ONCE()

// $ make coccicheck COCCI=/home/mark/once.cocci SPFLAGS="--include-headers" MODE=patch

virtual patch

@ depends on patch @
expression E1, E2;
@@

- ACCESS_ONCE(E1) = E2
+ WRITE_ONCE(E1, E2)

@ depends on patch @
expression E;
@@

- ACCESS_ONCE(E)
+ READ_ONCE(E)
----

Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: davem@davemloft.net
Cc: linux-arch@vger.kernel.org
Cc: mpe@ellerman.id.au
Cc: shuah@kernel.org
Cc: snitzer@redhat.com
Cc: thor.thayer@linux.intel.com
Cc: tj@kernel.org
Cc: viro@zeniv.linux.org.uk
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/1508792849-3115-19-git-send-email-paulmck@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2017-10-25 11:01:08 +02:00
Vasily Gorbik
f554be42fd s390/spinlock: use cpu alternatives to enable niai instruction
Enable niai instruction in the spinlock code at run-time for machines
on which facility 49 is available (zEC12 and newer).

Signed-off-by: Vasily Gorbik <gor@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-18 14:11:33 +02:00
Heiko Carstens
49913f1fd0 s390: cleanup string ops prototypes
Just some trivial changes like removing the extern keyword from the
header file, renaming arguments to match the man pages, and whitespace
removal.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09 11:18:08 +02:00
Heiko Carstens
993fef95b9 s390: optimize memset implementation
Like for the memset16/32/64 variants avoid that subsequent mvc
instructions depend on each other since that might have negative
performance impacts.

This patch is currently hardly relevant since at least gcc 7.1
generates only inline memset code and not a single memset call.
However there is no reason to not provide an optimized version
just in case gcc generates memset calls again, like it did in
the past.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09 11:18:07 +02:00
Heiko Carstens
0b77d6701c s390: implement memset16, memset32 & memset64
Provide fast versions of the new memset variants. E.g. the generic
memset64 is ten times slower than the optimized version if used on a
whole page.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-10-09 11:18:04 +02:00
Martin Schwidefsky
eb3b7b848f s390/rwlock: introduce rwlock wait queueing
Like the common queued rwlock code the s390 implementation uses the
queued spinlock code on a spinlock_t embedded in the rwlock_t to achieve
the queueing. The encoding of the rwlock_t differs though, the counter
field in the rwlock_t is split into two parts. The upper two bytes hold
the write bit and the write wait counter, the lower two bytes hold the
read counter.

The arch_read_lock operation works exactly like the common qrwlock but
the enqueue operation for a writer follows a diffent logic. After the
failed inline try to get the rwlock in write, the writer first increases
the write wait counter, acquires the wait spin_lock for the queueing,
and then loops until there are no readers and the write bit is zero.
Without the write wait counter a CPU that just released the rwlock
could immediately reacquire the lock in the inline code, bypassing all
outstanding read and write waiters. For s390 this would cause massive
imbalances in favour of writers in case of a contended rwlock.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-28 07:29:44 +02:00
Martin Schwidefsky
b96f7d881a s390/spinlock: introduce spinlock wait queueing
The queued spinlock code for s390 follows the principles of the common
code qspinlock implementation but with a few notable differences.

The format of the spinlock_t locking word differs, s390 needs to store
the logical CPU number of the lock holder in the spinlock_t to be able
to use the diagnose 9c directed yield hypervisor call.

The inline code sequences for spin_lock and spin_unlock are nice and
short. The inline portion of a spin_lock now typically looks like this:

	lhi	%r0,0			# 0 indicates an empty lock
	l	%r1,0x3a0		# CPU number + 1 from lowcore
	cs	%r0,%r1,<some_lock>	# lock operation
	jnz	call_wait		# on failure call wait function
locked:
	...
call_wait:
	la	%r2,<some_lock>
	brasl	%r14,arch_spin_lock_wait
	j	locked

A spin_unlock is as simple as before:

	lhi	%r0,0
	sth	%r0,2(%r2)		# unlock operation

After a CPU has queued itself it may not enable interrupts again for the
arch_spin_lock_flags() variant. The arch_spin_lock_wait_flags wait function
is removed.

To improve performance the code implements opportunistic lock stealing.
If the wait function finds a spinlock_t that indicates that the lock is
free but there are queued waiters, the CPU may steal the lock up to three
times without queueing itself. The lock stealing update the steal counter
in the lock word to prevent more than 3 steals. The counter is reset at
the time the CPU next in the queue successfully takes the lock.

While the queued spinlocks improve performance in a system with dedicated
CPUs, in a virtualized environment with continuously overcommitted CPUs
the queued spinlocks can have a negative effect on performance. This
is due to the fact that a queued CPU that is preempted by the hypervisor
will block the queue at some point even without holding the lock. With
the classic spinlock it does not matter if a CPU is preempted that waits
for the lock. Therefore use the queued spinlock code only if the system
runs with dedicated CPUs and fall back to classic spinlocks when running
with shared CPUs.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-28 07:29:44 +02:00
Martin Schwidefsky
8153380379 s390/spinlock: use the cpu number +1 as spinlock value
The queued spinlock code will come out simpler if the encoding of
the CPU that holds the spinlock is (cpu+1) instead of (~cpu).

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-09-28 07:29:44 +02:00
Martin Schwidefsky
d66bf801e0 s390/uaccess: avoid mvcos jump label
If the kernel is compiled for z10 or later machines the uaccess
code inlines the mvcos instruction. The facility bit 27 which
indicates the availability of MVCOS has to be set. The have_mvcos
jump label will always be true.

Make the generation of the have_mvcos jump label conditional on
!CONFIG_HAVE_MARCH_Z10_FEATURES.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-08-29 16:29:10 +02:00
Martin Schwidefsky
7f7e6e28cd s390/spinlock: add niai spinlock hints
The z14 machine introduces new mode of the next-instruction-access-intent
NIAI instruction. With NIAI-8 it is possible to pin a cache-line on a
CPU for a small amount of time, NIAI-7 releases the cache-line again.
Finally NIAI-4 can be used to prevent the CPU to speculatively access
memory beyond the compare-and-swap instruction to get the lock.

Use these instruction in the spinlock code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-26 08:25:23 +02:00
Martin Schwidefsky
6e2ef5e4f6 s390/time: add support for the TOD clock epoch extension
The TOD epoch extension adds 8 epoch bits to the TOD clock to provide
a continuous clock after 2042/09/17. The store-clock-extended (STCKE)
instruction will store the epoch index in the first byte of the
16 bytes stored by the instruction. The read_boot_clock64 and the
read_presistent_clock64 functions need to take the additional bits
into account to give the correct result after 2042/09/17.

The clock-comparator register will stay 64 bit wide. The comparison
of the clock-comparator with the TOD clock is limited to bytes
1 to 8 of the extended TOD format. To deal with the overflow problem
due to an epoch change the clock-comparator sign control in CR0 can
be used to switch the comparison of the 64-bit TOD clock with the
clock-comparator to a signed comparison.

The decision between the signed vs. unsigned clock-comparator
comparisons is done at boot time. Only if the TOD clock is in the
second half of a 142 year epoch the signed comparison is used.
This solves the epoch overflow issue as long as the machine is
booted at least once in an epoch.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-07-26 08:25:14 +02:00
Heiko Carstens
f5c8b96010 s390/uaccess: use sane length for __strncpy_from_user()
The average string that is copied from user space to kernel space is
rather short. E.g. booting a system involves about 50.000
strncpy_from_user() calls where the NULL terminated string has an
average size of 27 bytes.

By default our s390 specific strncpy_from_user() implementation
however copies up to 4096 bytes, which is a waste of cpu cycles and
cache lines. Therefore reduce the default length to L1_CACHE_BYTES
(256 bytes), which also reduces the average execution time of
strncpy_from_user() by 30-40%.

Alternatively we could have switched to the generic
strncpy_from_user() implementation, however it turned out that that
variant would be slower than the now optimized s390 variant.

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-05-09 10:43:24 +02:00
Heiko Carstens
db55947dd2 s390/uprobes: fix compile for !KPROBES
Fix the following compile error(s) if CONFIG_KPROBES is disabled:

arch/s390/kernel/uprobes.c:79:14:
 error: implicit declaration of function 'probe_get_fixup_type'
arch/s390/kernel/uprobes.c:87:14:
 error: 'FIXUP_PSW_NORMAL' undeclared (first use in this function)

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-05-03 09:08:57 +02:00
Linus Torvalds
b68e7e952f Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:

 - three merges for KVM/s390 with changes for vfio-ccw and cpacf. The
   patches are included in the KVM tree as well, let git sort it out.

 - add the new 'trng' random number generator

 - provide the secure key verification API for the pkey interface

 - introduce the z13 cpu counters to perf

 - add a new system call to set up the guarded storage facility

 - simplify TASK_SIZE and arch_get_unmapped_area

 - export the raw STSI data related to CPU topology to user space

 - ... and the usual churn of bug-fixes and cleanups.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (74 commits)
  s390/crypt: use the correct module alias for paes_s390.
  s390/cpacf: Introduce kma instruction
  s390/cpacf: query instructions use unique parameters for compatibility with KMA
  s390/trng: Introduce s390 TRNG device driver.
  s390/crypto: Provide s390 specific arch random functionality.
  s390/crypto: Add new subfunctions to the cpacf PRNO function.
  s390/crypto: Renaming PPNO to PRNO.
  s390/pageattr: avoid unnecessary page table splitting
  s390/mm: simplify arch_get_unmapped_area[_topdown]
  s390/mm: make TASK_SIZE independent from the number of page table levels
  s390/gs: add regset for the guarded storage broadcast control block
  s390/kvm: Add use_cmma field to mm_context_t
  s390/kvm: Add PGSTE manipulation functions
  vfio: ccw: improve error handling for vfio_ccw_mdev_remove
  vfio: ccw: remove unnecessary NULL checks of a pointer
  s390/spinlock: remove compare and delay instruction
  s390/spinlock: use atomic primitives for spinlocks
  s390/cpumf: simplify detection of guest samples
  s390/pci: remove forward declaration
  s390/pci: increase the PCI_NR_FUNCTIONS default
  ...
2017-05-02 09:50:09 -07:00
Martin Schwidefsky
b13de4b7ad s390/spinlock: remove compare and delay instruction
The CAD instruction never worked quite as expected for the spinlock
code. It has been disabled by default with git commit 61b0b01686,
if the "cad" kernel parameter is specified it is enabled for both user
space and the spinlock code. Leave the option to enable the instruction
for user space but remove it from the spinlock code.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-04-12 08:43:33 +02:00
Martin Schwidefsky
02c503ff23 s390/spinlock: use atomic primitives for spinlocks
Add a couple more __atomic_xxx function to atomic_ops.h and use them
to replace the compare-and-swap inlines in the spinlock code. This
changes the type of the lock value from unsigned int to int.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-04-12 08:43:33 +02:00
Al Viro
37096003c8 s390: get rid of zeroing, switch to RAW_COPY_USER
[folded a fix from Martin]
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2017-03-30 10:47:28 -04:00
Christian Borntraeger
187b5f41b4 s390: replace ACCESS_ONCE with READ_ONCE
Remove the last places of ACCESS_ONCE in s390 code.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17 07:40:46 +01:00
Paul Gortmaker
d321796753 s390: Audit and remove any remaining unnecessary uses of module.h
Historically a lot of these existed because we did not have
a distinction between what was modular code and what was providing
support to modules via EXPORT_SYMBOL and friends.  That changed
when we forked out support for the latter into the export.h file.

This means we should be able to reduce the usage of module.h
in code that is obj-y Makefile or bool Kconfig.  The advantage
in doing so is that module.h itself sources about 15 other headers;
adding significantly to what we feed cpp, and it can obscure what
headers we are effectively using.

Since module.h was the source for init.h (for __init) and for
export.h (for EXPORT_SYMBOL) we consider each change instance
for the presence of either and replace as needed.  An instance
where module_param was used without moduleparam.h was also fixed,
as well as implicit use of ptrace.h and string.h headers.

Signed-off-by: Paul Gortmaker <paul.gortmaker@windriver.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-02-17 07:40:41 +01:00
Heiko Carstens
551f413434 s390/lib: improve memmove, memset and memcpy
Improve the memmove implementation to save one instruction and use
better label names. Also use better label names for the memset and
memcpy implementations so everything looks consistent.

Suggested-by: Jens Remus <jremus@linux.vnet.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2017-01-16 07:27:51 +01:00
Heiko Carstens
7a71fd1c59 s390/lib: add missing memory barriers to string inline assemblies
We have a couple of inline assemblies like memchr() and strlen() that
read from memory, but tell the compiler only they need the addresses
of the strings they access.
This allows the compiler to omit the initialization of such strings
and therefore generate broken code. Add the missing memory barrier to
all string related inline assemblies to fix this potential issue. It
looks like the compiler currently does not generate broken code due to
these bugs.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-14 16:33:41 +01:00
Linus Torvalds
2ec4584eb8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 updates from Martin Schwidefsky:
 "The main bulk of the s390 patches for the 4.10 merge window:

   - Add support for the contiguous memory allocator.

   - The recovery for I/O errors in the dasd device driver is improved,
     the driver will now remove channel paths that are not working
     properly.

   - Additional fields are added to /proc/sysinfo, the extended
     partition name and the partition UUID.

   - New naming for PCI devices with system defined UIDs.

   - The last few remaining alloc_bootmem calls are converted to
     memblock.

   - The thread_info structure is stripped down and moved to the
     task_struct. The only field left in thread_info is the flags field.

   - Rework of the arch topology code to fix a fake numa issue.

   - Refactoring of the atomic primitives and add a new preempt_count
     implementation.

   - Clocksource steering for the STP sync check offsets.

   - The s390 specific headers are changed to make them usable with
     CLANG.

   - Bug fixes and cleanup"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux: (70 commits)
  s390/cpumf: Use configuration level indication for sampling data
  s390: provide memmove implementation
  s390: cleanup arch/s390/kernel Makefile
  s390: fix initrd corruptions with gcov/kcov instrumented kernels
  s390: exclude early C code from gcov profiling
  s390/dasd: channel path aware error recovery
  s390/dasd: extend dasd path handling
  s390: remove unused labels from entry.S
  s390/vmlogrdr: fix IUCV buffer allocation
  s390/crypto: unlock on error in prng_tdes_read()
  s390/sysinfo: show partition extended name and UUID if available
  s390/numa: pin all possible cpus to nodes early
  s390/numa: establish cpu to node mapping early
  s390/topology: use cpu_topology array instead of per cpu variable
  s390/smp: initialize cpu_present_mask in setup_arch
  s390/topology: always use s390 specific sched_domain_topology_level
  s390/smp: use smp_get_base_cpu() helper function
  s390/numa: always use logical cpu and core ids
  s390: Remove VLAIS in ptff() and clear_table()
  s390: fix machine check panic stack switch
  ...
2016-12-13 16:33:33 -08:00
Heiko Carstens
b4623d4e5b s390: provide memmove implementation
Provide an s390 specific memmove implementation which is faster than
the generic implementation which copies byte-wise.

For non-destructive (as defined by the mvc instruction) memmove
operations the following table compares the old default implementation
versus the new s390 specific implementation:

size     old   new
   1     1ns   8ns
   2     2ns   8ns
   4     4ns   8ns
   8     7ns   8ns
  16    17ns   8ns
  32    35ns   8ns
  64    65ns   9ns
 128   146ns  10ns
 256   298ns  11ns
 512   537ns  11ns
1024  1193ns  19ns
2048  2405ns  36ns

So only for very small sizes the old implementation is faster. For
overlapping memmoves, where the mvc instruction can't be used, the new
implementation is as slow as the old one.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-12-12 12:11:32 +01:00
Christian Borntraeger
760928c0da locking/spinlocks, s390: Implement vcpu_is_preempted(cpu)
This implements the s390 version for vcpu_is_preempted(cpu),
by reworking the existing smp_vcpu_scheduled() function into
arch_vcpu_is_preempted().

We can then also get rid of the local cpu_is_preempted()
function by moving the CIF_ENABLED_WAIT test into
arch_vcpu_is_preempted().

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: David.Laight@ACULAB.COM
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: benh@kernel.crashing.org
Cc: boqun.feng@gmail.com
Cc: bsingharora@gmail.com
Cc: dave@stgolabs.net
Cc: jgross@suse.com
Cc: kernellwp@gmail.com
Cc: konrad.wilk@oracle.com
Cc: linuxppc-dev@lists.ozlabs.org
Cc: mpe@ellerman.id.au
Cc: paulmck@linux.vnet.ibm.com
Cc: paulus@samba.org
Cc: pbonzini@redhat.com
Cc: rkrcmar@redhat.com
Cc: virtualization@lists.linux-foundation.org
Cc: will.deacon@arm.com
Cc: xen-devel-request@lists.xenproject.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1478077718-37424-6-git-send-email-xinhui.pan@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2016-11-22 12:48:06 +01:00
Linus Torvalds
84d69848c9 Merge branch 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild
Pull kbuild updates from Michal Marek:

 - EXPORT_SYMBOL for asm source by Al Viro.

   This does bring a regression, because genksyms no longer generates
   checksums for these symbols (CONFIG_MODVERSIONS). Nick Piggin is
   working on a patch to fix this.

   Plus, we are talking about functions like strcpy(), which rarely
   change prototypes.

 - Fixes for PPC fallout of the above by Stephen Rothwell and Nick
   Piggin

 - fixdep speedup by Alexey Dobriyan.

 - preparatory work by Nick Piggin to allow architectures to build with
   -ffunction-sections, -fdata-sections and --gc-sections

 - CONFIG_THIN_ARCHIVES support by Stephen Rothwell

 - fix for filenames with colons in the initramfs source by me.

* 'kbuild' of git://git.kernel.org/pub/scm/linux/kernel/git/mmarek/kbuild: (22 commits)
  initramfs: Escape colons in depfile
  ppc: there is no clear_pages to export
  powerpc/64: whitelist unresolved modversions CRCs
  kbuild: -ffunction-sections fix for archs with conflicting sections
  kbuild: add arch specific post-link Makefile
  kbuild: allow archs to select link dead code/data elimination
  kbuild: allow architectures to use thin archives instead of ld -r
  kbuild: Regenerate genksyms lexer
  kbuild: genksyms fix for typeof handling
  fixdep: faster CONFIG_ search
  ia64: move exports to definitions
  sparc32: debride memcpy.S a bit
  [sparc] unify 32bit and 64bit string.h
  sparc: move exports to definitions
  ppc: move exports to definitions
  arm: move exports to definitions
  s390: move exports to definitions
  m68k: move exports to definitions
  alpha: move exports to actual definitions
  x86: move exports to actual definitions
  ...
2016-10-14 14:26:58 -07:00
Linus Torvalds
45b6ae761e Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux
Pull s390 fixes from Martin Schwidefsky:
 "A couple of bug fixes, minor cleanup and a change to the default
  config"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/s390/linux:
  s390/dasd: fix failing CUIR assignment under LPAR
  s390/pageattr: handle numpages parameter correctly
  s390/dasd: fix hanging device after clear subchannel
  s390/qdio: avoid reschedule of outbound tasklet once killed
  s390/qdio: remove checks for ccw device internal state
  s390/qdio: fix double return code evaluation
  s390/qdio: get rid of spin_lock_irqsave usage
  s390/cio: remove subchannel_id from ccw_device_private
  s390/qdio: obtain subchannel_id via ccw_device_get_schid()
  s390/cio: stop using subchannel_id from ccw_device_private
  s390/config: make the vector optimized crc function builtin
  s390/lib: fix memcmp and strstr
  s390/crc32-vx: Fix checksum calculation for small sizes
  s390: clarify compressed image code path
2016-08-16 15:50:22 -07:00
Linus Torvalds
1eccfa090e Implements HARDENED_USERCOPY verification of copy_to_user/copy_from_user
bounds checking for most architectures on SLAB and SLUB.
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 Comment: Kees Cook <kees@outflux.net>
 
 iQIcBAABCgAGBQJXl9tlAAoJEIly9N/cbcAm5BoP/ikTtDp2bFw1sn92yHTnIWzl
 O+dcKVAeRgjfnSvPfb1JITpaM58exQSaDsPBeR0DbVzU1zDdhLcwHHiQupFh98Ka
 vBZthbrlL/u4NB26enEEW0iyA32BsxYBMnIu0z5ux9RbZflmQwGQ0c0rvy3dJ7/b
 FzB5ayVST5y/a0m6/sImeeExh78GU9rsMb1XmJRMwlJAy6miDz/F9TP0LnuW6PhG
 J5XC99ygNJS1pQBLACRsrZw6ImgBxXnWCok6tWPMxFfD+rJBU2//wqS+HozyMWHL
 iYP7+ytVo/ZVok4114X/V4Oof3a6wqgpBuYrivJ228QO+UsLYbYLo6sZ8kRK7VFm
 9GgHo/8rWB1T9lBbSaa7UL5r0dVNNLjFGS42vwV+YlgUMQ1A35VRojO0jUnJSIQU
 Ug1IxKmylLd0nEcwD8/l3DXeQABsfL8GsoKW0OtdTZtW4RND4gzq34LK6t7hvayF
 kUkLg1OLNdUJwOi16M/rhugwYFZIMfoxQtjkRXKWN4RZ2QgSHnx2lhqNmRGPAXBG
 uy21wlzUTfLTqTpoeOyHzJwyF2qf2y4nsziBMhvmlrUvIzW1LIrYUKCNT4HR8Sh5
 lC2WMGYuIqaiu+NOF3v6CgvKd9UW+mxMRyPEybH8mEgfm+FLZlWABiBjIUpSEZuB
 JFfuMv1zlljj/okIQRg8
 =USIR
 -----END PGP SIGNATURE-----

Merge tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux

Pull usercopy protection from Kees Cook:
 "Tbhis implements HARDENED_USERCOPY verification of copy_to_user and
  copy_from_user bounds checking for most architectures on SLAB and
  SLUB"

* tag 'usercopy-v4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  mm: SLUB hardened usercopy support
  mm: SLAB hardened usercopy support
  s390/uaccess: Enable hardened usercopy
  sparc/uaccess: Enable hardened usercopy
  powerpc/uaccess: Enable hardened usercopy
  ia64/uaccess: Enable hardened usercopy
  arm64/uaccess: Enable hardened usercopy
  ARM: uaccess: Enable hardened usercopy
  x86/uaccess: Enable hardened usercopy
  mm: Hardened usercopy
  mm: Implement stack frame object validation
  mm: Add is_migrate_cma_page
2016-08-08 14:48:14 -07:00
Christian Borntraeger
e2efc42454 s390/lib: fix memcmp and strstr
if two string compare equal the clcle instruction will update the
string addresses to point _after_ the string. This might already
be on a different page, so we should not use these pointer to
calculate the difference as in that case the calculation of the
difference can cause oopses.

The return value of memcmp does not need the difference, we
can just reuse the condition code and return for CC=1 (All bytes
compared, first operand low) -1 and for CC=2 (All bytes compared,
first operand high) +1
strstr also does not need the diff.
While fixing this, make the common function clcle "correct on its
own" by using l1 instead of l2 for the first length. strstr will
call this with l2 for both strings.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Fixes: db7f5eef3d ("s390/lib: use basic blocks for inline assemblies")
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-08-08 15:41:31 +02:00
Al Viro
711f5df7bf s390: move exports to definitions
Acked-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
2016-08-07 23:47:20 -04:00
Kees Cook
97433ea4fd s390/uaccess: Enable hardened usercopy
Enables CONFIG_HARDENED_USERCOPY checks on s390.

Signed-off-by: Kees Cook <keescook@chromium.org>
2016-07-26 14:41:52 -07:00
Heiko Carstens
db7f5eef3d s390/lib: use basic blocks for inline assemblies
Use only simple inline assemblies which consist of a single basic
block if the register asm construct is being used.

Otherwise gcc would generate broken code if the compiler option
--sanitize-coverage=trace-pc would be used.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-06-28 09:32:28 +02:00
Heiko Carstens
b8ac5e2f4d s390/uaccess: fix whitespace damage
Fix some whitespace damage that was introduced by me with a
query-replace when removing 31 bit support.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-06-13 15:58:25 +02:00
Heiko Carstens
8497695243 s390/spinlock: avoid yield to non existent cpu
arch_spin_lock_wait_flags() checks if a spinlock is not held before
trying a compare and swap instruction. If the lock is unlocked it
tries the compare and swap instruction, however if a different cpu
grabbed the lock in the meantime the instruction will fail as
expected.

Subsequently the arch_spin_lock_wait_flags() incorrectly tries to
figure out if the cpu that holds the lock is running. However it is
using the wrong cpu number for this (-1) and then will also yield the
current cpu to the wrong cpu.

Fix this by adding a missing continue statement.

Fixes: 470ada6b1a ("s390/spinlock: refactor arch_spin_lock_wait[_flags]")
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Acked-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-04-15 18:01:48 +02:00
Martin Schwidefsky
2cfc5f9ce7 s390/xor: optimized xor routing using the XC instruction
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2016-02-23 08:56:17 +01:00
Martin Schwidefsky
419123f900 s390/spinlock: do not yield to a CPU in udelay/mdelay
It does not make sense to try to relinquish the time slice with diag 0x9c
to a CPU in a state that does not allow to schedule the CPU. The scenario
where this can happen is a CPU waiting in udelay/mdelay while holding a
spin-lock.

Add a CIF bit to tag a CPU in enabled wait and use it to detect that the
yield of a CPU will not be successful and skip the diagnose call.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-27 09:24:18 +01:00
Martin Schwidefsky
db1c45154a s390/spinlock: avoid diagnose loop
The spinlock implementation calls the diagnose 0x9c / 0x44 immediately
if the SIGP sense running reported the target CPU as not running.

The diagnose 0x9c is a hint to the hypervisor to schedule the target
CPU in preference to the source CPU that issued the diagnose. It can
happen that on return from the diagnose the target CPU has not been
scheduled yet, e.g. if the target logical CPU is on another physical
CPU and the hypervisor did not want to migrate the logical CPU.

Avoid the immediate repeat of the diagnose instruction, instead do
the retry loop before the next invocation of diagnose 0x9c.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-11-27 09:24:15 +01:00
Heiko Carstens
48002bd5af s390/bitops: remove 31 bit related comments
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14 14:32:15 +02:00
Heiko Carstens
db7e007fd6 s390/udelay: make udelay have busy loop semantics
When using systemtap it was observed that our udelay implementation is
rather suboptimal if being called from a kprobe handler installed by
systemtap.

The problem observed when a kprobe was installed on lock_acquired().
When the probe was hit the kprobe handler did call udelay, which set
up an (internal) timer and reenabled interrupts (only the clock comparator
interrupt) and waited for the interrupt.
This is an optimization to avoid that the cpu is busy looping while waiting
that enough time passes. The problem is that the interrupt handler still
does call irq_enter()/irq_exit() which then again can lead to a deadlock,
since some accounting functions may take locks as well.

If one of these locks is the same, which caused lock_acquired() to be
called, we have a nice deadlock.

This patch reworks the udelay code for the interrupts disabled case to
immediately leave the low level interrupt handler when the clock
comparator interrupt happens. That way no C code is being called and the
deadlock cannot happen anymore.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14 14:32:13 +02:00
Christian Borntraeger
e0af21c56d s390/spinlock: use correct barriers
_raw_write_lock_wait first sets the high order bit to indicate a
pending writer and then waits for the reader to drop to zero.
smp_rmb by definition only orders reads against reads. Let's use
a full smp_mb instead. As right now smp_rmb is implemented
as full serialization, this needs no stable backport, but this
patch will be necessary if we reimplement smp_rmb.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-10-14 14:32:00 +02:00
Linus Torvalds
ca520cab25 Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull locking and atomic updates from Ingo Molnar:
 "Main changes in this cycle are:

   - Extend atomic primitives with coherent logic op primitives
     (atomic_{or,and,xor}()) and deprecate the old partial APIs
     (atomic_{set,clear}_mask())

     The old ops were incoherent with incompatible signatures across
     architectures and with incomplete support.  Now every architecture
     supports the primitives consistently (by Peter Zijlstra)

   - Generic support for 'relaxed atomics':

       - _acquire/release/relaxed() flavours of xchg(), cmpxchg() and {add,sub}_return()
       - atomic_read_acquire()
       - atomic_set_release()

     This came out of porting qwrlock code to arm64 (by Will Deacon)

   - Clean up the fragile static_key APIs that were causing repeat bugs,
     by introducing a new one:

       DEFINE_STATIC_KEY_TRUE(name);
       DEFINE_STATIC_KEY_FALSE(name);

     which define a key of different types with an initial true/false
     value.

     Then allow:

       static_branch_likely()
       static_branch_unlikely()

     to take a key of either type and emit the right instruction for the
     case.  To be able to know the 'type' of the static key we encode it
     in the jump entry (by Peter Zijlstra)

   - Static key self-tests (by Jason Baron)

   - qrwlock optimizations (by Waiman Long)

   - small futex enhancements (by Davidlohr Bueso)

   - ... and misc other changes"

* 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (63 commits)
  jump_label/x86: Work around asm build bug on older/backported GCCs
  locking, ARM, atomics: Define our SMP atomics in terms of _relaxed() operations
  locking, include/llist: Use linux/atomic.h instead of asm/cmpxchg.h
  locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics
  locking/qrwlock: Implement queue_write_unlock() using smp_store_release()
  locking/lockref: Remove homebrew cmpxchg64_relaxed() macro definition
  locking, asm-generic: Add _{relaxed|acquire|release}() variants for 'atomic_long_t'
  locking, asm-generic: Rework atomic-long.h to avoid bulk code duplication
  locking/atomics: Add _{acquire|release|relaxed}() variants of some atomic operations
  locking, compiler.h: Cast away attributes in the WRITE_ONCE() magic
  locking/static_keys: Make verify_keys() static
  jump label, locking/static_keys: Update docs
  locking/static_keys: Provide a selftest
  jump_label: Provide a self-test
  s390/uaccess, locking/static_keys: employ static_branch_likely()
  x86, tsc, locking/static_keys: Employ static_branch_likely()
  locking/static_keys: Add selftest
  locking/static_keys: Add a new static_key interface
  locking/static_keys: Rework update logic
  locking/static_keys: Add static_key_{en,dis}able() helpers
  ...
2015-09-03 15:46:07 -07:00
Heiko Carstens
cabc4abe8e s390/uaccess: remove uaccess_primary kernel parameter
get_user() and put_user() are inline functions in the meantime
again. Both will generate the mvcos instruction if compiled
with -march=z10 (or greater).

The kernel parameter "uaccess_primary" can only change the behavior
of out-of-line uaccess functions like copy_from_user() to not use
the mvcos instruction, but not for the above named inlined functions.

Therefore it is quite useless and the parameter can be removed.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-08-19 10:39:54 +02:00
Guenter Roeck
854508c0d0 s390/lib: export __delay
__delay is exported by most architectures, and may be used in modules.
Since it is not exported for s390, s390:allmodconfig currently fails
to build with

ERROR: "__delay" [drivers/net/phy/mdio-octeon.ko] undefined!

Fixes: a6d6786452 ("net: mdio-octeon: Modify driver to work on both
	ThunderX and Octeon")
Cc: Radha Mohan Chintakuntla <rchintakuntla@cavium.com>
Cc: David Daney <david.daney@cavium.com>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-08-07 09:57:37 +02:00
Heiko Carstens
ed79e94673 s390/uaccess, locking/static_keys: employ static_branch_likely()
Use the new static_branch_likely() primitive to make sure that the
most likely case is executed without taking an unconditional branch.
This wasn't possible with the old jump label primitives.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: linux-kernel@vger.kernel.org
Link: http://lkml.kernel.org/r/20150729064600.GB3953@osiris
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-08-03 11:34:17 +02:00
Heiko Carstens
304987e365 s390: remove "64" suffix from mem64.S and swsusp_asm64.S
Rename two more files which I forgot. Also remove the "asm" from the
swsusp_asm64.S file, since the ".S" suffix already makes it obvious
that this file contains assembler code.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-03-25 11:49:51 +01:00
Heiko Carstens
5a79859ae0 s390: remove 31 bit support
Remove the 31 bit support in order to reduce maintenance cost and
effectively remove dead code. Since a couple of years there is no
distribution left that comes with a 31 bit kernel.

The 31 bit kernel also has been broken since more than a year before
anybody noticed. In addition I added a removal warning to the kernel
shown at ipl for 5 minutes: a960062e58 ("s390: add 31 bit warning
message") which let everybody know about the plan to remove 31 bit
code. We didn't get any response.

Given that the last 31 bit only machine was introduced in 1999 let's
remove the code.
Anybody with 31 bit user space code can still use the compat mode.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-03-25 11:49:33 +01:00
Martin Schwidefsky
2c72a44ecd s390/spinlock: add compare-and-delay to lock wait loops
Add the compare-and-delay instruction to the spin-lock and rw-lock
retry loops. A CPU executing the compare-and-delay instruction stops
until the lock value has changed. This is done to make the locking
code for contended locks to behave better in regard to the multi-
hreading facility. A thread of a core executing a compare-and-delay
will allow the other threads of a core to get a larger share of the
core resources.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2015-01-23 15:17:04 +01:00
Jan Willeke
d6fe5be34c s390/uprobes: fix kprobes dependency
If kprobes is disabled uprobes will not compile.
Fix this by including the correct header files.

Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-10-17 14:45:51 +02:00
Martin Schwidefsky
b5f87f15e2 s390/idle: consolidate idle functions and definitions
Move the C functions and definitions related to the idle state handling
to arch/s390/include/asm/idle.h and arch/s390/kernel/idle.c. The function
s390_get_idle_time is renamed to arch_cpu_idle_time and vtime_stop_cpu to
enabled_wait.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-10-09 09:14:03 +02:00
Jan Willeke
975fab1739 s390/uprobes: common library for kprobes and uprobes
This patch moves common functions from kprobes.c to probes.c.
Thus its possible for uprobes to use them without enabling kprobes.

Signed-off-by: Jan Willeke <willeke@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:14 +02:00
Martin Schwidefsky
bbae71bf9c s390/rwlock: use the interlocked-access facility 1 instructions
Make use of the load-and-add, load-and-or and load-and-and instructions
to atomically update the read-write lock without a compare-and-swap loop.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:13 +02:00
Martin Schwidefsky
94232a4332 s390/rwlock: improve writer fairness
Set the write-lock bit in the out-of-line rwlock code to indicate that
a writer is waiting. Additional readers will no be able to get the lock
until at least one writer got the lock. Additional writers have to wait
for the first writer to release the lock again.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:12 +02:00
Martin Schwidefsky
2684e73a86 s390/rwlock: remove interrupt-enabling rwlock variant.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:10 +02:00
Martin Schwidefsky
d59b93da5e s390/rwlock: use directed yield for write-locked rwlocks
Add an owner field to the arch_rwlock_t to be able to pass the timeslice
of a virtual CPU with diagnose 0x9c to the lock owner in case the rwlock
is write-locked. The undirected yield in case the rwlock is acquired
writable but the lock is read-locked is removed.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-09-25 10:52:05 +02:00
Martin Schwidefsky
470ada6b1a s390/spinlock: refactor arch_spin_lock_wait[_flags]
Reorder the spinlock wait code to make it more readable.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20 08:58:55 +02:00
Martin Schwidefsky
939c5ae402 s390/rwlock: add missing local_irq_restore calls
The out of line _raw_read_lock_wait_flags/_raw_write_lock_wait_flags
functions for the arch_read_lock_flags/arch_write_lock_flags  calls
fail to re-enable the interrupts after another unsuccessful try to
get the lock with compare-and-swap. The following wait would be
done with interrupts disabled which is suboptimal.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20 08:58:54 +02:00
Martin Schwidefsky
bae8f56734 s390/spinlock,rwlock: always to a load-and-test first
In case a lock is contended it is better to do a load-and-test first
before trying to get the lock with compare-and-swap. This helps to avoid
unnecessary cache invalidations of the cacheline for the lock if the
CPU has to wait for the lock. For an uncontended lock doing the
compare-and-swap directly is a bit better, if the CPU does not have the
cacheline in its cache yet the compare-and-swap will get it read-write
immediately while a load-and-test would get it read-only first.

Always to the load-and-test first to avoid the cacheline invalidations
for the contended case outweight the potential read-only to read-write
cacheline upgrade for the uncontended case.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20 08:58:53 +02:00
Gerald Schaefer
2e4006b34d s390/spinlock: fix system hang with spin_retry <= 0
On LPAR, when spin_retry is set to <= 0, arch_spin_lock_wait() and
arch_spin_lock_wait_flags() may end up in a while(1) loop w/o doing
any compare and swap operation. To fix this, use do/while instead of
for loop.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20 08:58:52 +02:00
Martin Schwidefsky
beef560b4c s390/uaccess: simplify control register updates
Always switch to the kernel ASCE in switch_mm. Load the secondary
space ASCE in finish_arch_post_lock_switch after checking that
any pending page table operations have completed. The primary
ASCE is loaded in entry[64].S. With this the update_primary_asce
call can be removed from the switch_to macro and from the start
of switch_mm function. Remove the load_primary argument from
update_user_asce/clear_user_asce, rename update_user_asce to
set_user_asce and rename update_primary_asce to load_kernel_asce.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20 08:58:46 +02:00
Philipp Hachtmann
6c8cd5bbda s390/spinlock: optimize spinlock code sequence
Use lowcore constant to improve the code generated for spinlocks.

[ Martin Schwidefsky: patch breakdown and code beautification ]

Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20 08:58:42 +02:00
Philipp Hachtmann
5b3f683e69 s390/spinlock: cleanup spinlock code
Improve the spinlock code in several aspects:
 - Have _raw_compare_and_swap return true if the operation has been
   successful instead of returning the old value.
 - Remove the "volatile" from arch_spinlock_t and arch_rwlock_t
 - Rename 'owner_cpu' to 'lock'
 - Add helper functions arch_spin_trylock_once / arch_spin_tryrelease_once

[ Martin Schwidefsky: patch breakdown and code beautification ]

Signed-off-by: Philipp Hachtmann <phacht@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-05-20 08:58:41 +02:00
Heiko Carstens
fa255f51c9 s390/uaccess: fix possible register corruption in strnlen_user_srst()
The whole point of the out-of-line strnlen_user_srst() function was to
avoid corruption of register 0 due to register asm assignment.
However 'somebody' :) forgot to remove the update_primary_asce() function
call, which may clobber register 0 contents.
So let's remove that call and also move the size check to the calling
function.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-04-11 13:53:33 +02:00
Heiko Carstens
457f218095 s390/uaccess: rework uaccess code - fix locking issues
The current uaccess code uses a page table walk in some circumstances,
e.g. in case of the in atomic futex operations or if running on old
hardware which doesn't support the mvcos instruction.

However it turned out that the page table walk code does not correctly
lock page tables when accessing page table entries.
In other words: a different cpu may invalidate a page table entry while
the current cpu inspects the pte. This may lead to random data corruption.

Adding correct locking however isn't trivial for all uaccess operations.
Especially copy_in_user() is problematic since that requires to hold at
least two locks, but must be protected against ABBA deadlock when a
different cpu also performs a copy_in_user() operation.

So the solution is a different approach where we change address spaces:

User space runs in primary address mode, or access register mode within
vdso code, like it currently already does.

The kernel usually also runs in home space mode, however when accessing
user space the kernel switches to primary or secondary address mode if
the mvcos instruction is not available or if a compare-and-swap (futex)
instruction on a user space address is performed.
KVM however is special, since that requires the kernel to run in home
address space while implicitly accessing user space with the sie
instruction.

So we end up with:

User space:
- runs in primary or access register mode
- cr1 contains the user asce
- cr7 contains the user asce
- cr13 contains the kernel asce

Kernel space:
- runs in home space mode
- cr1 contains the user or kernel asce
  -> the kernel asce is loaded when a uaccess requires primary or
     secondary address mode
- cr7 contains the user or kernel asce, (changed with set_fs())
- cr13 contains the kernel asce

In case of uaccess the kernel changes to:
- primary space mode in case of a uaccess (copy_to_user) and uses
  e.g. the mvcp instruction to access user space. However the kernel
  will stay in home space mode if the mvcos instruction is available
- secondary space mode in case of futex atomic operations, so that the
  instructions come from primary address space and data from secondary
  space

In case of kvm the kernel runs in home space mode, but cr1 gets switched
to contain the gmap asce before the sie instruction gets executed. When
the sie instruction is finished cr1 will be switched back to contain the
user asce.

A context switch between two processes will always load the kernel asce
for the next process in cr1. So the first exit to user space is a bit
more expensive (one extra load control register instruction) than before,
however keeps the code rather simple.

In sum this means there is no need to perform any error prone page table
walks anymore when accessing user space.

The patch seems to be rather large, however it mainly removes the
the page table walk code and restores the previously deleted "standard"
uaccess code, with a couple of changes.

The uaccess without mvcos mode can be enforced with the "uaccess_primary"
kernel parameter.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-04-03 14:31:04 +02:00
Heiko Carstens
db85eaeb52 s390/bitops: fix comment
Fix some numbers in the comments describing the layout of the bit maps.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:20 +01:00
Heiko Carstens
56f15e518c s390/uaccess: introduce 'uaccesspt' kernel parameter
The uaccesspt kernel parameter allows to enforce using the uaccess page
table walk variant. This is mainly for debugging purposes, so this mode
can also be enabled on machines which support the mvcos instruction.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:17 +01:00
Heiko Carstens
ca04ddbf53 s390/setup: get rid of MACHINE_HAS_MVCOS machine flag
MACHINE_HAS_MVCOS is used exactly once when the machine is brought up.
There is no need to cache the flag in the machine_flags.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:15 +01:00
Heiko Carstens
211deca6bf s390/uaccess: consistent types
The types 'size_t' and 'unsigned long' have been used randomly for the
uaccess functions. This looks rather confusing.
So let's change all functions to use unsigned long instead and get rid
of size_t in order to have a consistent interface.

The only exception is strncpy_from_user() which uses 'long' since it
may return a signed value (-EFAULT).

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:15 +01:00
Heiko Carstens
4f41c2b456 s390/uaccess: get rid of indirect function calls
There are only two uaccess variants on s390 left: the version that is used
if the mvcos instruction is available, and the page table walk variant.
So there is no need for expensive indirect function calls.

By default the mvcos variant will be called. If the mvcos instruction is not
available it will call the page table walk variant.

For minimal performance impact the "if (mvcos_is_available)" is implemented
with a jump label, which will be a six byte nop on machines with mvcos.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:14 +01:00
Heiko Carstens
cfa785e623 s390/uaccess: normalize order of parameters of indirect uaccess function calls
For some unknown reason the indirect uaccess functions on s390 implement a
different parameter order than what is usual.

e.g.:

unsigned long copy_to_user(void *to, const void *from, unsigned long n);
vs.
size_t (*copy_to_user)(size_t n, void __user * to, const void *from);

Let's get rid of this confusing parameter reordering.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-02-21 08:50:13 +01:00
Heiko Carstens
0ff2fe5236 s390/uaccess: remove dead extern declarations, make functions static
Remove some dead uaccess extern declarations and also make some functions
static, since they are only used locally.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-01-22 14:02:17 +01:00
Heiko Carstens
b03b467944 s390/uaccess: test if current->mm is set before walking page tables
If get_fs() == USER_DS we better test if current->mm is not zero before
walking page tables.
The page table walk code would try to lock mm->page_table_lock, however
if mm is zero this might crash.

Now it is arguably incorrect trying to access userspace if current->mm
is zero, however we have seen that and s390 would be the only architecture
which would crash in such a case.
So we better make the page table walk code a bit more robust and report
always a fault instead.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-01-22 14:02:16 +01:00
Hendrik Brueckner
b4a960159e s390: Fix misspellings using 'codespell' tool
Signed-off-by: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2014-01-16 16:40:13 +01:00
Heiko Carstens
71a86ef055 s390/uaccess: add missing page table walk range check
When translating a user space address, the address must be checked against
the ASCE limit of the process. If the address is larger than the maximum
address that is reachable with the ASCE, an ASCE type exception must be
generated.

The current code simply ignored the higher order bits. This resulted in an
address wrap around in user space instead of an exception in user space.

Cc: stable@vger.kernel.org # v3.9+
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-11-25 09:15:38 +01:00
Martin Schwidefsky
e258d719ff s390/uaccess: always run the kernel in home space
Simplify the uaccess code by removing the user_mode=home option.
The kernel will now always run in the home space mode.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:57 +02:00
Heiko Carstens
7d7c7b24e4 s390/bitops: rename find_first_bit_left() to find_first_bit_inv()
find_first_bit_left() and friends have nothing to do with the normal
LSB0 bit numbering for big endian machines used in Linux (least
significant bit has bit number 0).
Instead they use MSB0 bit numbering, where the most signficant bit has
bit number 0. So rename find_first_bit_left() and friends to
find_first_bit_inv(), to avoid any confusion.
Also provide inv versions of set_bit, clear_bit and test_bit.

This also removes the confusing use of e.g. set_bit() in airq.c which
uses a "be_to_le" bit number conversion, which could imply that instead
set_bit_le() could be used. But that is entirely wrong since the _le
bitops variant uses yet another bit numbering scheme.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:56 +02:00
Heiko Carstens
746479cdcb s390/bitops: use generic find bit functions / reimplement _left variant
Just like all other architectures we should use out-of-line find bit
operations, since the inline variant bloat the size of the kernel image.
And also like all other architecures we should only supply optimized
variants of the __ffs, ffs, etc. primitives.

Therefore this patch removes the inlined s390 find bit functions and uses
the generic out-of-line variants instead.

The optimization of the primitives follows with the next patch.

With this patch also the functions find_first_bit_left() and
find_next_bit_left() have been reimplemented, since logically, they are
nothing else but a find_first_bit()/find_next_bit() implementation that
use an inverted __fls() instead of __ffs().
Also the restriction that these functions only work on machines which
support the "flogr" instruction is gone now.

This reduces the size of the kernel image (defconfig, -march=z9-109)
by 144,482 bytes.
Alone the size of the function build_sched_domains() gets reduced from
7 KB to 3,5 KB.

We also git rid of unused functions like find_first_bit_le()...

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-24 17:16:55 +02:00
Martin Schwidefsky
8c071b0f19 s390/time: correct use of store clock fast
The result of the store-clock-fast (STCKF) instruction is a bit fuzzy.
It can happen that the value stored on one CPU is smaller than the value
stored on another CPU, although the order of the stores is the other
way around. This can cause deltas of get_tod_clock() values to become
negative when they should not be.

We need to be more careful with store-clock-fast, this patch partially
reverts git commit e4b7b4238e666682555461fa52eecd74652f36bb "time:
always use stckf instead of stck if available". The get_tod_clock()
function now uses the store-clock-extended (STCKE) instruction.
get_tod_clock_fast() can be used if the fuzziness of store-clock-fast
is acceptable e.g. for wait loops local to a CPU.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-10-22 09:16:40 +02:00
Martin Schwidefsky
0587d409ec s390/time: return with irqs disabled from psw_idle
Modify the psw_idle waiting logic in entry[64].S to return with
interrupts disabled. This avoids potential issues with udelay
and interrupt loops as interrupts are not reenabled after
clock comparator interrupts.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-28 09:19:23 +02:00
Martin Schwidefsky
e509861105 s390/mm: cleanup page table definitions
Improve the encoding of the different pte types and the naming of the
page, segment table and region table bits. Due to the different pte
encoding the hugetlbfs primitives need to be adapted as well. To improve
compatability with common code make the huge ptes use the encoding of
normal ptes. The conversion between the pte and pmd encoding for a huge
pte is done with set_huge_pte_at and huge_ptep_get.
Overall the code is now easier to understand.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-08-22 12:20:06 +02:00
Heiko Carstens
12d8471315 s390/uaccess: add "fallthrough" comments
Add "fallthrough" comments so nobody wonders if a break statement is missing.

Reported-by: Joe Perches <joe@perches.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-05-02 15:50:19 +02:00
Stephen Boyd
446f24d119 Kconfig: consolidate CONFIG_DEBUG_STRICT_USER_COPY_CHECKS
The help text for this config is duplicated across the x86, parisc, and
s390 Kconfig.debug files.  Arnd Bergman noted that the help text was
slightly misleading and should be fixed to state that enabling this
option isn't a problem when using pre 4.4 gcc.

To simplify the rewording, consolidate the text into lib/Kconfig.debug
and modify it there to be more explicit about when you should say N to
this config.

Also, make the text a bit more generic by stating that this option
enables compile time checks so we can cover architectures which emit
warnings vs.  ones which emit errors.  The details of how an
architecture decided to implement the checks isn't as important as the
concept of compile time checking of copy_from_user() calls.

While we're doing this, remove all the copy_from_user_overflow() code
that's duplicated many times and place it into lib/ so that any
architecture supporting this option can get the function for free.

Signed-off-by: Stephen Boyd <sboyd@codeaurora.org>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Ingo Molnar <mingo@kernel.org>
Acked-by: H. Peter Anvin <hpa@zytor.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>
Acked-by: Helge Deller <deller@gmx.de>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2013-04-30 17:04:09 -07:00
Heiko Carstens
ea81531de2 s390/uaccess: fix page table walk
When translating user space addresses to kernel addresses the follow_table()
function had two bugs:

- PROT_NONE mappings could be read accessed via the kernel mapping. That is
  e.g. putting a filename into a user page, then protecting the page with
  PROT_NONE and afterwards issuing the "open" syscall with a pointer to
  the filename would incorrectly succeed.

- when walking the page tables it used the pgd/pud/pmd/pte primitives which
  with dynamic page tables give no indication which real level of page tables
  is being walked (region2, region3, segment or page table). So in case of an
  exception the translation exception code passed to __handle_fault() is not
  necessarily correct.
  This is not really an issue since __handle_fault() doesn't evaluate the code.
  Only in case of e.g. a SIGBUS this code gets passed to user space. If user
  space can do something sane with the value is a different question though.

To fix these issues don't use any Linux primitives. Only walk the page tables
like the hardware would do it, however we leave quite some checks away since
we know that we only have full size page tables and each index is within bounds.

In theory this should fix all issues...

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-04-02 08:53:08 +02:00
Heiko Carstens
b7fef2dd72 s390/uaccess: fix clear_user_pt()
The page table walker variant of clear_user() is supposed to copy the
contents of the empty zero page to user space.
However since 238ec4ef "[S390] zero page cache synonyms" empty_zero_page
is not anymore the page itself but contains the pointer to the empty zero
pages. Therefore the page table walker variant of clear_user() copied
the address of the first empty zero page and afterwards more or less
random data to user space instead of clearing the given user space range.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-03-21 13:35:38 +01:00
Heiko Carstens
066c437359 s390/uaccess: fix kernel ds access for page table walk
When the kernel resides in home space and the mvcos instruction is not
available uaccesses for kernel ds happen via simple strnlen() or memcpy()
calls.
This however can break badly, since uaccesses in kernel space may fail as
well, especially if CONFIG_DEBUG_PAGEALLOC is turned on.

To fix this implement strnlen_kernel() and copy_in_kernel() functions
which can only be used by the page table uaccess functions. These two
functions detect invalid memory accesses and return the correct length
of processed data.. Both functions are more or less a copy of the std
variants without sacf calls.

Fixes ipl crashes on 31 bit machines as well on 64 bit machines without
mvcos. Caused by changing the default address space of the kernel being
home space.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-02-28 09:37:12 +01:00
Heiko Carstens
225cf8d69c s390/uaccess: fix strncpy_from_user string length check
The "standard" and page table walk variants of strncpy_from_user() first
check the length of the to be copied string in userspace.
The string is then copied to kernel space and the length returned to the
caller.
However userspace can modify the string at any time while the kernel
checks for the length of the string or copies the string. In result the
returned length of the string is not necessarily correct.
Fix this by copying in a loop which mimics the mvcos variant of
strncpy_from_user(), which handles this correctly.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-02-28 09:37:11 +01:00
Heiko Carstens
f45655f6a6 s390/uaccess: fix strncpy_from_user/strnlen_user zero maxlen case
If the maximum length specified for the to be accessed string for
strncpy_from_user() and strnlen_user() is zero the following incorrect
values would be returned or incorrect memory accesses would happen:

strnlen_user_std() and strnlen_user_pt() incorrectly return "1"
strncpy_from_user_pt() would incorrectly access "dst[maxlen - 1]"
strncpy_from_user_mvcos() would incorrectly return "-EFAULT"

Fix all these oddities by adding early checks.

Reviewed-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-02-28 09:37:08 +01:00
Heiko Carstens
d7b788cd06 s390/uaccess: shorten strncpy_from_user/strnlen_user
Always stay within page boundaries when copying from user within
strlen_user_mvcos()/strncpy_from_user_mvcos(). This allows to
shorten the code a bit and may prevent unnecessary faults, since
we copy quite large amounts of memory to kernel space.

Also directly call the mvcos variants of copy_from_user() to
avoid indirect branches.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-02-28 09:37:07 +01:00
Martin Schwidefsky
abf09bed3c s390/mm: implement software dirty bits
The s390 architecture is unique in respect to dirty page detection,
it uses the change bit in the per-page storage key to track page
modifications. All other architectures track dirty bits by means
of page table entries. This property of s390 has caused numerous
problems in the past, e.g. see git commit ef5d437f71
"mm: fix XFS oops due to dirty pages without buffers on s390".

To avoid future issues in regard to per-page dirty bits convert
s390 to a fault based software dirty bit detection mechanism. All
user page table entries which are marked as clean will be hardware
read-only, even if the pte is supposed to be writable. A write by
the user process will trigger a protection fault which will cause
the user pte to be marked as dirty and the hardware read-only bit
is removed.

With this change the dirty bit in the storage key is irrelevant
for Linux as a host, but the storage key is still required for
KVM guests. The effect is that page_test_and_clear_dirty and the
related code can be removed. The referenced bit in the storage
key is still used by the page_test_and_clear_young primitive to
provide page age information.

For page cache pages of mappings with mapping_cap_account_dirty
there will not be any change in behavior as the dirty bit tracking
already uses read-only ptes to control the amount of dirty pages.
Only for swap cache pages and pages of mappings without
mapping_cap_account_dirty there can be additional protection faults.
To avoid an excessive number of additional faults the mk_pte
primitive checks for PageDirty if the pgprot value allows for writes
and pre-dirties the pte. That avoids all additional faults for
tmpfs and shmem pages until these pages are added to the swap cache.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-02-14 15:55:23 +01:00
Heiko Carstens
1aae0560d1 s390/time: rename tod clock access functions
Fix name clash with some common code device drivers and add "tod"
to all tod clock access function names.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2013-02-14 15:55:10 +01:00
Gerald Schaefer
156152f84e s390/mm: use pmd_large() instead of pmd_huge()
Without CONFIG_HUGETLB_PAGE, pmd_huge() will always return 0. So
pmd_large() should be used instead in places where both transparent
huge pages and hugetlbfs pages can occur.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-10-26 16:44:23 +02:00
Heiko Carstens
535c611ddd s390/string: provide asm lib functions for memcpy and memcmp
Our memcpy and memcmp variants were implemented by calling the corresponding
gcc builtin variants.
However gcc is free to replace a call to __builtin_memcmp with a call to memcmp
which, when called, will result in an endless recursion within memcmp.
So let's provide asm variants and also fix the variants that are used for
uncompressing the kernel image.
In addition remove all other occurences of builtin function calls.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-26 15:44:50 +02:00
Gerald Schaefer
4db84d4f07 s390/mm: fix user access page-table walk code
The s390 page-table walk code, used for user copy and futex, currently
cannot handle huge pages. As far as user copy is concerned, that is
not really a problem because those functions will only be used on old
hardware that has no huge page support. But the futex code will also
use pagetable walk functions on current hardware when user space runs
in primary space mode. So, if a futex sits in a huge page, the futex
operation on it will result in a page fault loop or even data
corruption.

This patch adds the code for resolving huge page mappings in the user
access pagetable walk code on s390.

Signed-off-by: Gerald Schaefer <gerald.schaefer@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-09-17 09:44:24 +02:00
Martin Schwidefsky
27f6b41662 s390/vtimer: rework virtual timer interface
The current virtual timer interface is inherently per-cpu and hard to
use. The sole user of the interface is appldata which uses it to execute
a function after a specific amount of cputime has been used over all cpus.

Rework the virtual timer interface to hook into the cputime accounting.
This makes the interface independent from the CPU timer interrupts, and
makes the virtual timers global as opposed to per-cpu.
Overall the code is greatly simplified. The downside is that the accuracy
is not as good as the original implementation, but it is still good enough
for appldata.

Reviewed-by: Jan Glauber <jang@linux.vnet.ibm.com>
Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-07-20 11:15:08 +02:00
Heiko Carstens
a53c8fab3f s390/comments: unify copyright messages and remove file names
Remove the file name from the comment at top of many files. In most
cases the file name was wrong anyway, so it's rather pointless.

Also unify the IBM copyright statement. We did have a lot of sightly
different statements and wanted to change them one after another
whenever a file gets touched. However that never happened. Instead
people start to take the old/"wrong" statements to use as a template
for new files.
So unify all of them in one go.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
2012-07-20 11:15:04 +02:00
Heiko Carstens
f4815ac6c9 s390/headers: replace __s390x__ with CONFIG_64BIT where possible
Replace __s390x__ with CONFIG_64BIT in all places that are not exported
to userspace or guarded with #ifdef __KERNEL__.

Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-05-24 10:10:10 +02:00
Martin Schwidefsky
4c1051e37a [S390] rework idle code
Whenever the cpu loads an enabled wait PSW it will appear as idle to the
underlying host system. The code in default_idle calls vtime_stop_cpu
which does the necessary voodoo to get the cpu time accounting right.
The udelay code just loads an enabled wait PSW. To correct this rework
the vtime_stop_cpu/vtime_start_cpu logic and move the difficult parts
to entry[64].S, vtime_stop_cpu can now be called from anywhere and
vtime_start_cpu is gone. The correction of the cpu time during wakeup
from an enabled wait PSW is done with a critical section in entry[64].S.
As vtime_start_cpu is gone, s390_idle_check can be removed as well.

Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-03-11 11:59:28 -04:00
Martin Schwidefsky
8b646bd759 [S390] rework smp code
Define struct pcpu and merge some of the NR_CPUS arrays into it, including
__cpu_logical_map, current_set and smp_cpu_state. Split smp related
functions to those operating on physical cpus and the functions operating
on a logical cpu number. Make the functions for physical cpus use a
pointer to a struct pcpu. This hides the knowledge about cpu addresses in
smp.c, entry[64].S and swsusp_asm64.S, thus remove the sigp.h header.

The PSW restart mechanism is used to start secondary cpus, calling a
function on an online cpu, calling a function on the ipl cpu, and for
the nmi signal. Replace the different assembler functions with a
single function restart_int_handler. The new entry point calls a function
whose pointer is stored in the lowcore of the target cpu and it can wait
for the source cpu to stop. This covers all existing use cases.

Overall the code is now simpler and there are ~380 lines less code.

Reviewed-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
2012-03-11 11:59:28 -04:00