linux-stable/arch/mips
Lorenzo Stoakes 662df3e5c3 mm: madvise: implement lightweight guard page mechanism
Implement a new lightweight guard page feature, that is, regions of
userland virtual memory that, when accessed, cause a fatal signal to be
raised.

Currently users must establish PROT_NONE ranges to achieve this.

However this is very costly memory-wise - we need a VMA for each and every
one of these regions AND they become unmergeable with surrounding VMAs.

In addition, repeated mmap() calls require repeated kernel context switches
and contention on the mmap lock to install these ranges, potentially also
having to unmap memory if installed over existing ranges.
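
For comparison, the PROT_NONE approach described above can be sketched
from userland as follows.  This is a minimal illustration, not code from
this patch: map_with_prot_none_guard() is a hypothetical helper name, and
it uses mprotect() on the tail of a single mapping rather than a second
mmap() call.  Either way the guard page ends up in a VMA of its own.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Map usable_len bytes of read/write anonymous memory followed by one
 * PROT_NONE guard page.  usable_len must be a non-zero multiple of the
 * page size.  Returns the start of the usable region, or NULL on failure.
 * (Hypothetical helper, for illustration only.)
 */
static void *map_with_prot_none_guard(size_t usable_len)
{
	long page = sysconf(_SC_PAGESIZE);

	if (page <= 0 || usable_len == 0 || usable_len % (size_t)page != 0)
		return NULL;

	size_t total = usable_len + (size_t)page;
	char *base = mmap(NULL, total, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (base == MAP_FAILED)
		return NULL;

	/* Changing protection on the tail splits it off into its own VMA. */
	if (mprotect(base + usable_len, (size_t)page, PROT_NONE) != 0) {
		munmap(base, total);
		return NULL;
	}
	return base;
}
```

Each guard created this way costs one extra VMA, which is the overhead
the lightweight mechanism is intended to avoid.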

The lightweight guard approach eliminates the VMA cost altogether - rather
than establishing a PROT_NONE VMA, it operates at the level of page table
entries - establishing PTE markers such that accesses to them cause a
fault followed by a SIGSEGV signal being raised.

This is achieved through the PTE marker mechanism, which we have already
extended to provide PTE_MARKER_GUARD, and which we install via the generic
page walking logic that we have extended for this purpose.

These guard ranges are established with MADV_GUARD_INSTALL.  If the range
in which they are installed contains any existing mappings, they will be
zapped, i.e. the range is freed and its memory unmapped (thus mimicking the
behaviour of MADV_DONTNEED in this respect).

Any existing guard entries will be left untouched.  There is therefore no
nesting of guarded pages.

Guarded ranges are NOT cleared by MADV_DONTNEED nor MADV_FREE (in both
instances the memory range may be reused at which point a user would
expect guards to still be in place), but they are cleared via
MADV_GUARD_REMOVE, process teardown or unmapping of memory ranges.

The guard property can be removed from ranges via MADV_GUARD_REMOVE.  The
ranges over which this is applied, should they contain non-guard entries,
will be untouched, with only guard entries being cleared.
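
The install/remove behaviour described above can be exercised with a
small round trip.  This is a sketch under stated assumptions:
guard_roundtrip() is a hypothetical name, the fallback values 102 and 103
are assumed from the uapi headers, and any failure to install is treated
as the feature being unavailable on the running kernel.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE 1
#endif
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Assumed uapi values, used only if the libc headers do not define them. */
#ifndef MADV_GUARD_INSTALL
#define MADV_GUARD_INSTALL 102
#endif
#ifndef MADV_GUARD_REMOVE
#define MADV_GUARD_REMOVE 103
#endif

/*
 * Populate one page, install a guard over it, then remove the guard.
 * Returns 1 if the round trip behaved as described, 0 if guard install
 * is not available here, and -1 on any other failure.
 * (Hypothetical helper, for illustration only.)
 */
static int guard_roundtrip(void)
{
	long page = sysconf(_SC_PAGESIZE);
	if (page <= 0)
		return -1;

	size_t len = (size_t)page;
	volatile unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
					 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return -1;

	int result = -1;

	p[0] = 42;	/* populate the page before guarding it */

	if (madvise((void *)p, len, MADV_GUARD_INSTALL) != 0) {
		result = 0;	/* treated as unavailable on this kernel */
	} else if (madvise((void *)p, len, MADV_GUARD_REMOVE) == 0) {
		/*
		 * Install zapped the populated page, so once the guard is
		 * removed the access faults in a fresh zero-filled page.
		 */
		result = (p[0] == 0) ? 1 : -1;
	}

	munmap((void *)p, len);
	return result;
}
```

Between the two madvise() calls any access to the page would raise
SIGSEGV, so the sketch only touches the memory before the install and
after the remove.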

We permit this operation on anonymous memory only, and only on VMAs which
are non-special, non-huge and not mlock()'d (permitting mlock()'d VMAs
would mean dropping locked pages, which would be rather counterintuitive).

Racing page faults can interrupt an attempt to install guard pages,
resulting in a zap followed by another attempt, and this process can end
up being repeated.  If this happens more than would be expected in normal
operation, we rescind locks and retry the whole thing, which avoids lock
contention in this scenario.

Link: https://lkml.kernel.org/r/6aafb5821bf209f277dfae0787abb2ef87a37542.1730123433.git.lorenzo.stoakes@oracle.com
Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Suggested-by: Vlastimil Babka <vbabka@suse.cz>
Suggested-by: Jann Horn <jannh@google.com>
Suggested-by: David Hildenbrand <david@redhat.com>
Acked-by: Vlastimil Babka <vbabka@suse.cz>
Cc: Arnd Bergmann <arnd@kernel.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Christoph Hellwig <hch@infradead.org>
Cc: Chris Zankel <chris@zankel.net>
Cc: Helge Deller <deller@gmx.de>
Cc: James E.J. Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Jeff Xu <jeffxu@chromium.org>
Cc: John Hubbard <jhubbard@nvidia.com>
Cc: Liam R. Howlett <Liam.Howlett@Oracle.com>
Cc: Matthew Wilcox (Oracle) <willy@infradead.org>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Richard Henderson <richard.henderson@linaro.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Shuah Khan <skhan@linuxfoundation.org>
Cc: Sidhartha Kumar <sidhartha.kumar@oracle.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-11-11 00:26:45 -08:00
..
alchemy MIPS: Remove unused function dump_au1000_dma_channel() in dma.c 2024-08-29 10:38:18 +02:00
ath25 MIPS: ath25: Constify static irq_domain_ops 2022-02-22 09:39:03 +01:00
ath79 MIPS: ath79: remove obsolete ATH79_DEV_* configs 2023-03-17 10:28:04 +01:00
bcm47xx mips: bmips: setup: make CBR address configurable 2024-06-27 10:44:32 +02:00
bcm63xx gpiolib: legacy: Kill GPIOF_INIT_* definitions 2024-09-02 11:47:06 +02:00
bmips mips: bmips: setup: make CBR address configurable 2024-06-27 10:44:32 +02:00
boot move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
cavium-octeon Just cleanups and fixes 2024-01-17 11:20:50 -08:00
cobalt MIPS: Cobalt: Fix missing prototypes 2024-01-22 10:32:21 +01:00
configs mips: configs: enable I2C_DESIGNWARE_CORE with I2C_DESIGNWARE_PLATFORM 2024-09-10 00:36:52 +02:00
crypto move asm/unaligned.h to linux/unaligned.h 2024-10-02 17:23:23 -04:00
dec genirq: Convert kstat_irqs to a struct 2024-04-12 17:08:05 +02:00
fw MIPS: fw arc: Fix missing prototypes 2024-01-22 11:12:01 +01:00
generic mips: generic: add fdt fixup for Realtek reference board 2024-07-12 13:12:13 +02:00
include mm: madvise: implement lightweight guard page mechanism 2024-11-11 00:26:45 -08:00
ingenic MIPS: Kconfig: ingenic: Ensure MACH_INGENIC_GENERIC selects all SoCs 2021-06-01 11:44:47 +02:00
jazz mips/jazz: remove unused jazz_handle_int() declaration 2024-08-29 10:39:00 +02:00
kernel for-6.12-rc5-tag 2024-11-01 07:31:47 -10:00
kvm KVM: MIPS: Rename virtualization {en,dis}abling APIs to match common KVM 2024-09-04 11:02:33 -04:00
lantiq MIPS: lantiq: improve USB initialization 2024-07-12 13:04:24 +02:00
lib mips: implement xor_unlock_is_negative_byte 2023-10-18 14:34:17 -07:00
loongson2ef MIPS: Fix typos 2024-01-08 10:39:12 +01:00
loongson32 MIPS: loongson32: Remove dma.h and nand.h 2023-10-06 10:10:13 +02:00
loongson64 arch, mm: pull out allocation of NODE_DATA to generic code 2024-09-03 21:15:28 -07:00
math-emu MIPS: Fix comment typo 2022-09-12 15:33:24 +02:00
mm mm: make arch_get_unmapped_area() take vm_flags by default 2024-09-09 16:39:13 -07:00
mobileye MIPS: mobileye: Add EyeQ6H support 2024-06-11 10:15:50 +02:00
mti-malta vgacon: clean up global screen_info instances 2023-10-17 10:17:02 +02:00
n64 mips: Add N64 machine type 2021-01-22 11:40:00 +01:00
net bpf: Take return from set_memory_rox() into account with bpf_jit_binary_lock_ro() 2024-03-14 19:28:52 -07:00
pci MIPS: Octeron: remove source file executable bit 2024-07-09 10:38:08 +02:00
pic32 MIPS: Fixup explicit DT include clean-up 2023-07-28 11:41:09 +02:00
power mips: suspend: include linux/suspend.h as needed 2023-12-10 17:21:41 -08:00
ralink MIPS: ralink: Fix missing get_c0_perfcount_int prototype 2024-08-29 10:29:28 +02:00
rb532 MIPS: RB532: Declare prom_setup_cmdline() and rb532_gpio_init() static 2024-04-15 10:21:52 +02:00
sgi-ip22 mips: sgi-ip22: Fix the build 2024-08-13 11:34:55 +02:00
sgi-ip27 arch, mm: move definition of node_data to generic code 2024-09-03 21:15:28 -07:00
sgi-ip30 MIPS: ip30: ip30-console: Add missing include 2024-06-19 13:09:35 +02:00
sgi-ip32 MIPS: sgi-ip32: Fix missing prototypes 2024-01-22 11:12:19 +01:00
sibyte mips: sibyte: add missing MODULE_DESCRIPTION() macro 2024-07-23 09:47:40 +02:00
sni vgacon: clean up global screen_info instances 2023-10-17 10:17:02 +02:00
tools MIPS: fix typos in comments 2022-05-04 22:22:59 +02:00
txx9 mips: txx9: make txx9_sramc_subsys const 2024-02-20 13:36:34 +01:00
vdso Makefile: remove redundant tool coverage variables 2024-05-14 23:35:48 +09:00
Kbuild MIPS: Share generic kernel code with other architecture 2024-02-20 13:36:25 +01:00
Kbuild.platforms MIPS: mobileye: Add EyeQ6H support 2024-06-11 10:15:50 +02:00
Kconfig ALong with the usual shower of singleton patches, notable patch series in 2024-09-21 07:29:05 -07:00
Kconfig.debug tracing: Refactor TRACE_IRQFLAGS_SUPPORT in Kconfig 2021-08-16 11:37:21 -04:00
Makefile MIPS: Fix fallback march for SB1 2024-07-15 18:16:23 +02:00
Makefile.postlink kbuild: remove ARCH_POSTLINK from module builds 2023-10-28 21:10:08 +09:00