linux-next/arch/s390
Ryan Roberts 3a5a8d343e mm: fix race between __split_huge_pmd_locked() and GUP-fast
__split_huge_pmd_locked() can be called for a present THP, devmap or
(non-present) migration entry.  It calls pmdp_invalidate() unconditionally
on the pmdp and only determines if it is present or not based on the
returned old pmd.  This is a problem for the migration entry case because
pmd_mkinvalid(), called by pmdp_invalidate() must only be called for a
present pmd.

On arm64 at least, pmd_mkinvalid() will mark the pmd such that any future
call to pmd_present() will return true.  And therefore any lockless
pgtable walker could see the migration entry pmd in this state and start
interpretting the fields as if it were present, leading to BadThings (TM).
GUP-fast appears to be one such lockless pgtable walker.

x86 does not suffer the above problem, but instead pmd_mkinvalid() will
corrupt the offset field of the swap entry within the swap pte.  See link
below for discussion of that problem.

Fix all of this by only calling pmdp_invalidate() for a present pmd.  And
for good measure let's add a warning to all implementations of
pmdp_invalidate[_ad]().  I've manually reviewed all other
pmdp_invalidate[_ad]() call sites and believe all others to be conformant.

This is a theoretical bug found during code review.  I don't have any test
case to trigger it in practice.

Link: https://lkml.kernel.org/r/20240501143310.1381675-1-ryan.roberts@arm.com
Link: https://lore.kernel.org/all/0dd7827a-6334-439a-8fd0-43c98e6af22b@arm.com/
Fixes: 84c3fc4e9c ("mm: thp: check pmd migration entry in common path")
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Reviewed-by: Zi Yan <ziy@nvidia.com>
Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Aneesh Kumar K.V <aneesh.kumar@kernel.org>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "David S. Miller" <davem@davemloft.net>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.ibm.com>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-05-07 10:37:00 -07:00
..
appldata S390: Remove now superfluous sentinel elem from ctl_table arrays 2023-10-10 15:22:02 -07:00
boot - Sumanth Korikkar has taught s390 to allocate hotplug-time page frames 2024-03-14 17:43:30 -07:00
configs s390/mm: provide simple ARCH_HAS_DEBUG_VIRTUAL support 2024-03-13 09:23:49 +01:00
crypto s390/crypto: remove retry loop with sleep from PAES pkey invocation 2024-03-07 14:41:15 +01:00
hypfs s390/hypfs_sprp: remove unneeded DMA zone allocation 2024-02-09 13:58:14 +01:00
include mm: fix race between __split_huge_pmd_locked() and GUP-fast 2024-05-07 10:37:00 -07:00
kernel fix missing vmalloc.h includes 2024-04-25 20:55:49 -07:00
kvm S390: 2024-03-15 13:03:13 -07:00
lib s390/checksum: provide csum_partial_copy_nocheck() 2024-02-16 14:30:17 +01:00
mm s390: mm: accelerate pagefault when badaccess 2024-04-25 20:56:39 -07:00
net s390/bpf: Fix bpf_plt pointer arithmetic 2024-03-19 22:52:43 -07:00
pci mm: pass VMA instead of MM to follow_pte() 2024-05-05 17:53:27 -07:00
purgatory s390 updates for 6.5 merge window part 2 2023-07-06 13:18:30 -07:00
tools s390/tools: handle rela R_390_GOTPCDBL/R_390_GOTOFF64 2024-03-07 17:02:05 +01:00
Kbuild - An extensive rework of kexec and crash Kconfig from Eric DeVolder 2023-08-29 14:53:51 -07:00
Kconfig mm/treewide: rename CONFIG_HAVE_FAST_GUP to CONFIG_HAVE_GUP_FAST 2024-04-25 20:56:41 -07:00
Kconfig.debug s390/Kconfig.debug: fix indentation 2022-06-01 12:03:15 +02:00
Makefile s390/mm: provide simple ARCH_HAS_DEBUG_VIRTUAL support 2024-03-13 09:23:49 +01:00