Merge tag 'loongarch-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson
Pull LoongArch updates from Huacai Chen:
- Better backtraces for humanization
- Relay BCE exceptions to userland as SIGSEGV
- Provide kernel fpu functions
- Optimize memory ops (memset/memcpy/memmove)
- Optimize checksum and crc32(c) calculation
- Add ARCH_HAS_FORTIFY_SOURCE selection
- Add function error injection support
- Add ftrace with direct call support
- Add basic perf tools support
* tag 'loongarch-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson: (24 commits)
tools/perf: Add basic support for LoongArch
LoongArch: ftrace: Add direct call trampoline samples support
LoongArch: ftrace: Add direct call support
LoongArch: ftrace: Implement ftrace_find_callable_addr() to simplify code
LoongArch: ftrace: Fix build error if DYNAMIC_FTRACE_WITH_REGS is not set
LoongArch: ftrace: Abstract DYNAMIC_FTRACE_WITH_ARGS accesses
LoongArch: Add support for function error injection
LoongArch: Add ARCH_HAS_FORTIFY_SOURCE selection
LoongArch: crypto: Add crc32 and crc32c hw acceleration
LoongArch: Add checksum optimization for 64-bit system
LoongArch: Optimize memory ops (memset/memcpy/memmove)
LoongArch: Provide kernel fpu functions
LoongArch: Relay BCE exceptions to userland as SIGSEGV with si_code=SEGV_BNDERR
LoongArch: Tweak the BADV and CPUCFG.PRID lines in show_regs()
LoongArch: Humanize the ESTAT line when showing registers
LoongArch: Humanize the ECFG line when showing registers
LoongArch: Humanize the EUEN line when showing registers
LoongArch: Humanize the PRMD line when showing registers
LoongArch: Humanize the CRMD line when showing registers
LoongArch: Fix format of CSR lines during show_regs()
...
The ftrace samples need per-architecture trampoline implementations to
save and restore argument registers around the calls to my_direct_func*
and to restore polluted registers (e.g., ra).
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS to provide the
register_ftrace_direct[_multi] interfaces, which allow users to register
a custom trampoline (direct_caller) as the mcount for one or more
target functions. modify_ftrace_direct[_multi] are also provided
for modifying direct_caller.
There are two cases to distinguish:
- If a direct call ops is the only one tracing a function AND the direct
  called trampoline is within the reach of a 'bl' instruction
  -> the ftrace patchsite jumps to the trampoline
- Else
  -> the ftrace patchsite jumps to the ftrace_regs_caller trampoline, which
     points to ftrace_list_ops, so it iterates over all registered ftrace
     ops, including the direct call ops, and calls its call_direct_funcs
     handler, which stores the direct called trampoline's address in the
     ftrace_regs; the ftrace_regs_caller trampoline then returns to that
     address instead of returning to the traced function
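The choice between the cases above can be modeled in userspace C. This is
only a sketch, not the kernel code: bl_reachable() and pick_patch_target()
are hypothetical names, and the +/-128 MiB reach assumes that 'bl' carries a
signed 26-bit offset counted in 4-byte units.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 'bl' reaches [-128 MiB, +128 MiB - 4] from the patch site. */
#define BL_RANGE (INT64_C(1) << 27)

/* Hypothetical helper: can a 'bl' located at pc branch directly to target? */
bool bl_reachable(uint64_t pc, uint64_t target)
{
	int64_t offset = (int64_t)(target - pc);

	return (offset & 3) == 0 && offset >= -BL_RANGE && offset < BL_RANGE;
}

/*
 * Hypothetical helper: choose where the patch site should branch.
 * only_direct_op is true when the direct call ops is the sole tracer.
 */
uint64_t pick_patch_target(uint64_t pc, uint64_t direct_tramp,
			   uint64_t regs_caller, bool only_direct_op)
{
	assert((pc & 3) == 0);	/* instructions are 4-byte aligned */

	if (only_direct_op && bl_reachable(pc, direct_tramp))
		return direct_tramp;	/* first case: straight to the trampoline */
	return regs_caller;		/* otherwise: via ftrace_regs_caller */
}
```

In the second case the direct trampoline is still reached, but only after
the list walk has stored its address for the return path.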
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Inspired by the commit 42d038c4fb ("arm64: Add support for function
error injection") and the commit ee55ff803b ("riscv: Add support for
function error injection"), this patch supports function error injection
for LoongArch.
This mainly implements two functions:
(1) regs_set_return_value(), which is used to overwrite the return value,
(2) override_function_with_return(), which makes the probed function
return immediately to its caller.
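The effect of the two functions can be illustrated with a userspace model.
The struct and the mock_* names below are hypothetical stand-ins, and the
register numbering (r1 = ra, r4 = a0) is assumed from the LoongArch calling
convention; this is a sketch, not the code added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for the saved register frame; field names are illustrative. */
struct mock_fei_regs {
	uint64_t gpr[32];	/* gpr[1] = ra, gpr[4] = a0 (return value) */
	uint64_t pc;		/* address at which execution resumes */
};

/* (1) Overwrite the value the probed function appears to return. */
void mock_set_return_value(struct mock_fei_regs *regs, uint64_t rc)
{
	assert(regs);
	regs->gpr[4] = rc;
}

/* (2) Skip the probed function body: resume at the caller's return address. */
void mock_override_function_with_return(struct mock_fei_regs *regs)
{
	assert(regs);
	regs->pc = regs->gpr[1];
}
```

With both applied at function entry, the caller sees the injected error
code as if the function had run and failed.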
Here is a simple test under CONFIG_FUNCTION_ERROR_INJECTION and
CONFIG_FAIL_FUNCTION:
# echo sys_clone > /sys/kernel/debug/fail_function/inject
# echo 100 > /sys/kernel/debug/fail_function/probability
# dmesg
bash: fork: Invalid argument
# dmesg
...
FAULT_INJECTION: forcing a failure.
name fail_function, interval 1, probability 100, space 0, times 1
...
Call Trace:
[<90000000002238f4>] show_stack+0x5c/0x180
[<90000000012e384c>] dump_stack_lvl+0x60/0x88
[<9000000000b1879c>] should_fail_ex+0x1b0/0x1f4
[<900000000032ead4>] fei_kprobe_handler+0x28/0x6c
[<9000000000230970>] kprobe_breakpoint_handler+0xf0/0x118
[<90000000012e3e60>] do_bp+0x2c4/0x358
[<9000000002241924>] exception_handlers+0x1924/0x10000
[<900000000023b7d0>] sys_clone+0x0/0x4
[<90000000012e4744>] do_syscall+0x7c/0x94
[<9000000000221e44>] handle_syscall+0xc4/0x160
Tested-by: Hengqi Chen <hengqi.chen@gmail.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
FORTIFY_SOURCE can detect various overflows at compile time and at run time.
ARCH_HAS_FORTIFY_SOURCE means that the architecture can be built and run
with CONFIG_FORTIFY_SOURCE. So select it in LoongArch.
See more about this feature from commit 6974f0c455 ("include/linux/
string.h: add the option of fortified string.h functions").
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Merge tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Nick Piggin's "shoot lazy tlbs" series, to improve the performance of
switching from a user process to a kernel thread.
- More folio conversions from Kefeng Wang, Zhang Peng and Pankaj
Raghav.
- zsmalloc performance improvements from Sergey Senozhatsky.
- Yue Zhao has found and fixed some data race issues around the
alteration of memcg userspace tunables.
- VFS rationalizations from Christoph Hellwig:
- removal of most of the callers of write_one_page()
- make __filemap_get_folio()'s return value more useful
- Luis Chamberlain has changed tmpfs so it no longer requires swap
backing. Use `mount -o noswap'.
- Qi Zheng has made the slab shrinkers operate locklessly, providing
some scalability benefits.
- Keith Busch has improved dmapool's performance, making part of its
operations O(1) rather than O(n).
- Peter Xu adds the UFFD_FEATURE_WP_UNPOPULATED feature to userfaultfd,
permitting userspace to wr-protect anon memory unpopulated ptes.
- Kirill Shutemov has changed MAX_ORDER's meaning to be inclusive
rather than exclusive, and has fixed a bunch of errors which were
caused by its unintuitive meaning.
- Axel Rasmussen gives userfaultfd the UFFDIO_CONTINUE_MODE_WP feature,
which causes minor faults to install a write-protected pte.
- Vlastimil Babka has done some maintenance work on vma_merge():
cleanups to the kernel code and improvements to our userspace test
harness.
- Cleanups to do_fault_around() by Lorenzo Stoakes.
- Mike Rapoport has moved a lot of initialization code out of various
mm/ files and into mm/mm_init.c.
- Lorenzo Stoakes removed vmf_insert_mixed_prot(), which was added for
DRM, but DRM doesn't use it any more.
- Lorenzo has also converted read_kcore() and vread() to use iterators
and has thereby removed the use of bounce buffers in some cases.
- Lorenzo has also contributed further cleanups of vma_merge().
- Chaitanya Prakash provides some fixes to the mmap selftesting code.
- Matthew Wilcox changes xfs and afs so they no longer take sleeping
locks in ->map_page(), a step towards RCUification of pagefaults.
- Suren Baghdasaryan has improved mmap_lock scalability by switching to
per-VMA locking.
- Frederic Weisbecker has reworked the percpu cache draining so that it
no longer causes latency glitches on cpu isolated workloads.
- Mike Rapoport cleans up and corrects the ARCH_FORCE_MAX_ORDER Kconfig
logic.
- Liu Shixin has changed zswap's initialization so we no longer waste a
chunk of memory if zswap is not being used.
- Yosry Ahmed has improved the performance of memcg statistics
flushing.
- David Stevens has fixed several issues involving khugepaged,
userfaultfd and shmem.
- Christoph Hellwig has provided some cleanup work to zram's IO-related
code paths.
- David Hildenbrand has fixed up some issues in the selftest code's
testing of our pte state changing.
- Pankaj Raghav has made page_endio() unneeded and has removed it.
- Peter Xu contributed some rationalizations of the userfaultfd
selftests.
- Yosry Ahmed has fixed an issue around memcg's page reclaim
accounting.
- Chaitanya Prakash has fixed some arm-related issues in the
selftests/mm code.
- Longlong Xia has improved the way in which KSM handles hwpoisoned
pages.
- Peter Xu fixes a few issues with uffd-wp at fork() time.
- Stefan Roesch has changed KSM so that it may now be used on a
per-process and per-cgroup basis.
* tag 'mm-stable-2023-04-27-15-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (369 commits)
mm,unmap: avoid flushing TLB in batch if PTE is inaccessible
shmem: restrict noswap option to initial user namespace
mm/khugepaged: fix conflicting mods to collapse_file()
sparse: remove unnecessary 0 values from rc
mm: move 'mmap_min_addr' logic from callers into vm_unmapped_area()
hugetlb: pte_alloc_huge() to replace huge pte_alloc_map()
maple_tree: fix allocation in mas_sparse_area()
mm: do not increment pgfault stats when page fault handler retries
zsmalloc: allow only one active pool compaction context
selftests/mm: add new selftests for KSM
mm: add new KSM process and sysfs knobs
mm: add new api to enable ksm per process
mm: shrinkers: fix debugfs file permissions
mm: don't check VMA write permissions if the PTE/PMD indicates write permissions
migrate_pages_batch: fix statistics for longterm pin retry
userfaultfd: use helper function range_in_vma()
lib/show_mem.c: use for_each_populated_zone() simplify code
mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()
fs/buffer: convert create_page_buffers to folio_create_buffers
fs/buffer: add folio_create_empty_buffers helper
...
Merge tag 'asm-generic-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic
Pull asm-generic updates from Arnd Bergmann:
"These are various cleanups, fixing a number of uapi header files to no
longer reference CONFIG_* symbols, and one patch that introduces the
new CONFIG_HAS_IOPORT symbol for architectures that provide working
inb()/outb() macros, as a preparation for adding driver dependencies
on those in the following release"
* tag 'asm-generic-6.4' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
Kconfig: introduce HAS_IOPORT option and select it as necessary
scripts: Update the CONFIG_* ignore list in headers_install.sh
pktcdvd: Remove CONFIG_CDROM_PKTCDVD_WCACHE from uapi header
Move bp_type_idx to include/linux/hw_breakpoint.h
Move ep_take_care_of_epollwakeup() to fs/eventpoll.c
Move COMPAT_ATM_ADDPARTY to net/atm/svc.c
The ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP config option is now used to
indicate both devdax and hugetlb vmemmap optimization support. Hence rename
it to the generic ARCH_WANT_OPTIMIZE_VMEMMAP.
Link: https://lkml.kernel.org/r/20230412050025.84346-2-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Reviewed-by: Muchun Song <songmuchun@bytedance.com>
Cc: Joao Martins <joao.m.martins@oracle.com>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Tarun Sahu <tsahu@linux.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
LoongArch maintains cache coherency in hardware, but when paired with
LS7A chipsets the WUC attribute (Weak-ordered UnCached, which is similar
to WriteCombine) is out of the scope of the cache coherency mechanism for
PCIe devices (this is a PCIe protocol violation, which may be fixed in
newer chipsets).
This means WUC can only be used for write-only memory regions now, so this
option is disabled by default, making WUC silently fall back to SUC for
ioremap(). You can enable this option if the kernel is guaranteed to run on
hardware without this bug.
Kernel parameter writecombine=on/off can be used to override the Kconfig
option.
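The interplay between the Kconfig default, the writecombine= parameter and
the silent fallback can be sketched as follows. This is a userspace model
under assumptions: wc_enabled() and effective_attr() are hypothetical names
and do not reflect how the kernel actually parses the parameter.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

enum cache_attr { ATTR_SUC, ATTR_WUC };

/* Hypothetical parser: "on"/"off" override the Kconfig default. */
bool wc_enabled(const char *param, bool kconfig_default)
{
	if (param && strcmp(param, "on") == 0)
		return true;
	if (param && strcmp(param, "off") == 0)
		return false;
	return kconfig_default;	/* parameter absent or unrecognized */
}

/* A WUC request silently degrades to SUC while write-combine is disabled. */
enum cache_attr effective_attr(enum cache_attr requested, bool enabled)
{
	assert(requested == ATTR_SUC || requested == ATTR_WUC);

	if (requested == ATTR_WUC && !enabled)
		return ATTR_SUC;
	return requested;
}
```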
Cc: stable@vger.kernel.org
Suggested-by: WANG Xuerui <kernel@xen0n.name>
Reviewed-by: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
LoongArch defines insane ranges for ARCH_FORCE_MAX_ORDER allowing
MAX_ORDER up to 63, which implies maximal contiguous allocation size of
2^63 pages.
Drop bogus definitions of ranges for ARCH_FORCE_MAX_ORDER and leave it a
simple integer with sensible defaults.
Users that *really* need to change the value of ARCH_FORCE_MAX_ORDER will
be able to do so but they won't be misled by the bogus ranges.
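To see why a limit of 63 is not meaningful, the arithmetic can be written
out. The helpers below are hypothetical and follow the convention used in
this message, where order N corresponds to a block of 2^N pages; the page
shift of 14 in the usage note is only an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Pages in the largest contiguous block for a given maximum order. */
uint64_t max_block_pages(unsigned int max_order)
{
	assert(max_order < 64);
	return UINT64_C(1) << max_order;
}

/* Bytes in that block; only representable while the sum stays below 64. */
uint64_t max_block_bytes(unsigned int max_order, unsigned int page_shift)
{
	assert(max_order + page_shift < 64);
	return UINT64_C(1) << (max_order + page_shift);
}
```

For example, max_block_bytes(11, 14) is 32 MiB, whereas order 63 with any
realistic page size cannot even be expressed in 64 bits.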
Link: https://lkml.kernel.org/r/20230322081727.2516291-1-rppt@kernel.org
Signed-off-by: Mike Rapoport (IBM) <rppt@kernel.org>
Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
We introduce a new HAS_IOPORT Kconfig option to indicate support for I/O
Port access. In a future patch HAS_IOPORT=n will disable compilation of
the I/O accessor functions inb()/outb() and friends on architectures,
such as s390, which cannot meaningfully support legacy I/O spaces.
The following architectures do not select HAS_IOPORT:
* ARC
* C-SKY
* Hexagon
* Nios II
* OpenRISC
* s390
* User-Mode Linux
* Xtensa
All other architectures select HAS_IOPORT at least conditionally.
The "depends on" relations on HAS_IOPORT in drivers as well as ifdefs
for HAS_IOPORT specific sections will be added in subsequent patches on
a per subsystem basis.
Co-developed-by: Arnd Bergmann <arnd@kernel.org>
Signed-off-by: Arnd Bergmann <arnd@kernel.org>
Acked-by: Johannes Berg <johannes@sipsolutions.net> # for ARCH=um
Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Use the generic kretprobe trampoline handler to add kretprobes support
for LoongArch.
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Kprobes allows you to trap at almost any kernel address and execute a
callback function. This commit adds kprobes support for LoongArch.
Tested-by: Jeff Xie <xiehuan09@gmail.com>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add regs_get_argument(), which returns the Nth argument of the function
call. This enables ftrace kprobe events to access kernel function
arguments via the $argN syntax.
E.g.:
echo 'p bio_add_page arg1=$arg1' > kprobe_events
bash: echo: write error: Invalid argument
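What such an accessor has to do can be shown with a userspace model. The
sketch assumes the LoongArch calling convention, in which the first eight
integer arguments travel in a0..a7 (r4..r11) and the rest on the stack; the
struct and the mock_get_argument() name are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define NR_REG_ARGS 8	/* assumption: a0..a7 hold the first eight arguments */

/* Stand-in for the saved register frame; field names are illustrative. */
struct mock_arg_regs {
	uint64_t gpr[32];		/* gpr[4]..gpr[11] = a0..a7 */
	const uint64_t *stack;		/* stand-in for the caller's stack slots */
	unsigned int stack_slots;	/* number of valid entries in stack */
};

/* Return argument n (0-based), or 0 if it lies beyond the modeled stack. */
uint64_t mock_get_argument(const struct mock_arg_regs *regs, unsigned int n)
{
	assert(regs);

	if (n < NR_REG_ARGS)
		return regs->gpr[4 + n];
	n -= NR_REG_ARGS;
	return n < regs->stack_slots ? regs->stack[n] : 0;
}
```

Note that the model is 0-based while $argN is 1-based, so $arg1 corresponds
to n = 0 here.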
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Use perf framework to manage hardware instruction and data breakpoints.
LoongArch defines hardware watchpoint functions for instruction fetch
and memory load/store operations. After the software configures hardware
watchpoints, the processor hardware will monitor the access address of
the instruction fetch and load/store operation, and trigger an exception
of the watchpoint when it meets the conditions set by the watchpoint.
The hardware monitoring points for instruction fetching and load/store
operations each have a register for the overall configuration of all
monitoring points, a register for recording the status of all monitoring
points, and four registers required for configuration of each watchpoint
individually.
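From userspace, breakpoints managed by the perf framework are requested
through a perf_event_attr of type PERF_TYPE_BREAKPOINT. The helper below is
a hypothetical sketch that only fills in the attribute; whether the request
succeeds depends on the hardware and on the perf permissions of the system.

```c
#include <assert.h>
#include <linux/hw_breakpoint.h>
#include <linux/perf_event.h>
#include <stdint.h>
#include <string.h>

/* Describe a write watchpoint of bp_len bytes on the address addr. */
void init_write_watchpoint(struct perf_event_attr *attr, const void *addr,
			   unsigned int bp_len)
{
	assert(attr && addr);

	memset(attr, 0, sizeof(*attr));
	attr->type = PERF_TYPE_BREAKPOINT;
	attr->size = sizeof(*attr);
	attr->bp_type = HW_BREAKPOINT_W;	/* trigger on stores */
	attr->bp_addr = (uintptr_t)addr;
	attr->bp_len = bp_len;			/* e.g. HW_BREAKPOINT_LEN_4 */
	attr->exclude_kernel = 1;		/* watch user-mode accesses only */
	attr->exclude_hv = 1;
}
```

The filled attribute would then be passed to the perf_event_open() system
call, for example via syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0) to
watch the calling thread.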
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This feature depends on the kernel being relocatable.
Enable using a single kernel image for kdump, so that it is no longer
necessary to build two kernels (the production kernel and the capture
kernel share a single kernel image).
Also enable CONFIG_CRASH_DUMP in loongson3_defconfig.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This patch adds support for relocating the kernel to a random address.
Entropy is derived from the banner, which changes with every build, and
from random_get_entropy(), which should provide additional runtime entropy.
The kernel is relocated by up to RANDOMIZE_BASE_MAX_OFFSET bytes from
its link address. Because relocation happens so early during kernel
boot, the amount of physical memory has not yet been determined. This
means the only way to limit relocation within the available memory is
via Kconfig. So we limit the maximum value of RANDOMIZE_BASE_MAX_OFFSET
to 256M (0x10000000) because our memory layout has many holes.
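How a bounded offset can be derived from such entropy is sketched below.
Everything here is an assumption made for illustration: the mixing function,
the 64 KiB alignment and the helper names are hypothetical, and only the
256M cap is taken from the description above.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_OFFSET UINT64_C(0x10000000)	/* 256M cap; must be a power of two */

/* Hypothetical mixer: fold the banner bytes into a runtime entropy value. */
uint64_t mix_entropy(const char *banner, uint64_t runtime_entropy)
{
	uint64_t hash = runtime_entropy;

	assert(banner);
	for (; *banner; banner++)
		hash = ((hash << 7) | (hash >> 57)) ^ (uint8_t)*banner;
	return hash;
}

/* Derive a 64 KiB aligned offset that is strictly below MAX_OFFSET. */
uint64_t relocation_offset(uint64_t hash)
{
	return (hash << 16) & (MAX_OFFSET - 1);
}
```

Masking with MAX_OFFSET - 1 keeps the result inside the Kconfig limit
without needing to know how much physical memory is present.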
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Xi Ruoyao <xry111@xry111.site> # Fix compiler warnings
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This config allows the kernel to be compiled as PIE and relocated to any
virtual address at runtime, which paves the way to KASLR.
Runtime relocation is possible because the relocation metadata is embedded
into the kernel.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Xi Ruoyao <xry111@xry111.site> # Use arch_initcall
Signed-off-by: Jinyang He <hejinyang@loongson.cn> # Provide la_abs relocation code
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Introduce the Kconfig option ARCH_STRICT_ALIGN to make -mstrict-align
configurable.
Not all LoongArch cores support h/w unaligned access, so we can use the
-mstrict-align build parameter to prevent unaligned accesses.
CPUs with h/w unaligned access support:
Loongson-2K2000/2K3000/3A5000/3C5000/3D5000.
CPUs without h/w unaligned access support:
Loongson-2K500/2K1000.
This option is enabled by default so that the kernel can run on
all LoongArch systems. But you can disable it manually if you want to
run the kernel only on systems with h/w unaligned access support, in order
to optimise for performance.
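Code that must work with either setting usually avoids dereferencing a
possibly misaligned pointer directly. A common portable idiom, shown here
as a generic sketch with hypothetical helper names, is to go through
memcpy() and let the compiler pick the access width for the target.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 32-bit value from a possibly unaligned address. The compiler may
 * emit byte accesses under -mstrict-align, or a single load otherwise.
 */
uint32_t load_u32_unaligned(const void *p)
{
	uint32_t v;

	assert(p);
	memcpy(&v, p, sizeof(v));
	return v;
}

/* Store a 32-bit value to a possibly unaligned address. */
void store_u32_unaligned(void *p, uint32_t v)
{
	assert(p);
	memcpy(p, &v, sizeof(v));
}
```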
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Merge tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- More userfaultfd work from Peter Xu
- Several convert-to-folios series from Sidhartha Kumar and Huang Ying
- Some filemap cleanups from Vishal Moola
- David Hildenbrand added the ability to selftest anon memory COW
handling
- Some cpuset simplifications from Liu Shixin
- Addition of vmalloc tracing support by Uladzislau Rezki
- Some pagecache folioifications and simplifications from Matthew
Wilcox
- A pagemap cleanup from Kefeng Wang: we have VM_ACCESS_FLAGS, so use
it
- Miguel Ojeda contributed some cleanups for our use of the
__no_sanitize_thread__ gcc keyword.
This series should have been in the non-MM tree, my bad
- Naoya Horiguchi improved the interaction between memory poisoning and
memory section removal for huge pages
- DAMON cleanups and tuneups from SeongJae Park
- Tony Luck fixed the handling of COW faults against poisoned pages
- Peter Xu utilized the PTE marker code for handling swapin errors
- Hugh Dickins reworked compound page mapcount handling, simplifying it
and making it more efficient
- Removal of the autonuma savedwrite infrastructure from Nadav Amit and
David Hildenbrand
- zram support for multiple compression streams from Sergey Senozhatsky
- David Hildenbrand reworked the GUP code's R/O long-term pinning so
that drivers no longer need to use the FOLL_FORCE workaround which
didn't work very well anyway
- Mel Gorman altered the page allocator so that local IRQs can remain
enabled during per-cpu page allocations
- Vishal Moola removed the try_to_release_page() wrapper
- Stefan Roesch added some per-BDI sysfs tunables which are used to
prevent network block devices from dirtying excessive amounts of
pagecache
- David Hildenbrand did some cleanup and repair work on KSM COW
breaking
- Nhat Pham and Johannes Weiner have implemented writeback in zswap's
zsmalloc backend
- Brian Foster has fixed a longstanding corner-case oddity in
file[map]_write_and_wait_range()
- sparse-vmemmap changes for MIPS, LoongArch and NIOS2 from Feiyang
Chen
- Shiyang Ruan has done some work on fsdax, to make its reflink mode
work better under xfstests. Better, but still not perfect
- Christoph Hellwig has removed the .writepage() method from several
filesystems. They only need .writepages()
- Yosry Ahmed wrote a series which fixes the memcg reclaim target
beancounting
- David Hildenbrand has fixed some of our MM selftests for 32-bit
machines
- Many singleton patches, as usual
* tag 'mm-stable-2022-12-13' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (313 commits)
mm/hugetlb: set head flag before setting compound_order in __prep_compound_gigantic_folio
mm: mmu_gather: allow more than one batch of delayed rmaps
mm: fix typo in struct pglist_data code comment
kmsan: fix memcpy tests
mm: add cond_resched() in swapin_walk_pmd_entry()
mm: do not show fs mm pc for VM_LOCKONFAULT pages
selftests/vm: ksm_functional_tests: fixes for 32bit
selftests/vm: cow: fix compile warning on 32bit
selftests/vm: madv_populate: fix missing MADV_POPULATE_(READ|WRITE) definitions
mm/gup_test: fix PIN_LONGTERM_TEST_READ with highmem
mm,thp,rmap: fix races between updates of subpages_mapcount
mm: memcg: fix swapcached stat accounting
mm: add nodes= arg to memory.reclaim
mm: disable top-tier fallback to reclaim on proactive reclaim
selftests: cgroup: make sure reclaim target memcg is unprotected
selftests: cgroup: refactor proactive reclaim code to reclaim_until()
mm: memcg: fix stale protection of reclaim target memcg
mm/mmap: properly unaccount memory on mas_preallocate() failure
omfs: remove ->writepage
jfs: remove ->writepage
...
Allow for arguments to be passed in to ftrace_regs by default. If this
is set, then arguments and stack can be found from the pt_regs.
1. HAVE_DYNAMIC_FTRACE_WITH_ARGS doesn't need a special hook for the graph
tracer entry point; instead we can use the graph_ops::func function to
install the return_hooker.
2. Livepatch will require this option in the future.
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This patch implements CONFIG_DYNAMIC_FTRACE_WITH_REGS on LoongArch,
which allows a traced function's arguments (and some other registers)
to be captured into a struct pt_regs, allowing these to be inspected
and modified.
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The compiler has inserted 2 NOPs before the regular function prologue.
T series registers are available and safe because of LoongArch's psABI.
At runtime, we can replace nop with bl to enable the ftrace call and replace
bl with nop to disable it. The bl instruction overwrites RA, so the
original RA value must be saved first; it is saved in t0 here.
Details are:
| Compiled | Disabled | Enabled |
+------------+------------------------+------------------------+
| nop | move t0, ra | move t0, ra |
| nop | nop | bl ftrace_caller |
| func_body | func_body | func_body |
The RA value will be recovered by ftrace_regs_entry, and restored into
RA before returning to the regular function prologue. When a function is
not being traced, the "move t0, ra" is not harmful.
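The toggling of the second slot in the table above can be modeled as
follows. This is a sketch under assumptions rather than the kernel
implementation: the opcode constants and the split of the 26-bit offset
into a low 16-bit and a high 10-bit field are my reading of the LoongArch
encoding, and encode_bl() and patch_slot() are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

#define INSN_NOP UINT32_C(0x03400000)	/* assumed: andi $zero, $zero, 0 */
#define OPC_BL   UINT32_C(0x54000000)	/* assumed: opcode 0b010101 in bits 31..26 */

/*
 * Encode "bl dest" for an instruction located at pc. The word offset is
 * split into a low 16-bit field (bits 25..10) and a high 10-bit field
 * (bits 9..0).
 */
uint32_t encode_bl(uint64_t pc, uint64_t dest)
{
	uint32_t words;

	assert(((dest - pc) & 3) == 0);	/* branch targets are 4-byte aligned */
	words = (uint32_t)((int64_t)(dest - pc) >> 2) & 0x03ffffff;
	return OPC_BL | ((words & 0xffff) << 10) | (words >> 16);
}

/* Second slot of the patch site: bl when tracing is enabled, nop otherwise. */
uint32_t patch_slot(uint64_t pc, uint64_t ftrace_caller, int enabled)
{
	return enabled ? encode_bl(pc, ftrace_caller) : INSN_NOP;
}
```

The first slot ("move t0, ra") stays the same in both states, which is why
only one instruction has to be rewritten when tracing is switched.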
1) ftrace_make_call, ftrace_make_nop (in kernel/ftrace.c)
The two functions turn each recorded call site of filtered functions
into a call to ftrace_caller or nops.
2) ftrace_update_ftrace_func (in kernel/ftrace.c)
turns the nops at ftrace_call into a call to a generic entry for
function tracers.
3) ftrace_caller (in kernel/mcount_dyn.S)
The entry that each _mcount call site calls once it has been
filtered to be traced.
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The recordmcount utility under scripts is run, after compiling each object,
to find all the locations that call _mcount() and put them into
a specific section named __mcount_loc.
Then the linker collects all such information into a table in the kernel
image (between __start_mcount_loc and __stop_mcount_loc) for later use
by ftrace.
This patch adds LoongArch specific definitions to identify such locations.
And on LoongArch, only the C version is used to build the kernel now that
CONFIG_HAVE_C_RECORDMCOUNT is on.
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This patch contains basic ftrace support for LoongArch. Specifically,
the function tracer (HAVE_FUNCTION_TRACER) and the function graph tracer
(HAVE_FUNCTION_GRAPH_TRACER) are implemented following the instructions in
Documentation/trace/ftrace-design.txt.
Using `-pg` makes the compiler emit a stub that calls `void _mcount(void *ra)`
like a child function. Thus, storing RA and allocating the stack can be seen
before `call _mcount`. Find `alloc stack` first, and then find `store RA`.
Note that the functions in both inst.c and time.c should not be hooked
with the compiler's -pg option: to prevent infinite self-referencing for
the former, and to ignore early setup stuff for the latter.
Co-developed-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Jinyang He <hejinyang@loongson.cn>
Signed-off-by: Qing Zhang <zhangqing@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add basic stack protector support similar to other architectures. A
constant canary value is set at boot time, and with help of compiler's
-fstack-protector we can detect stack corruption.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Since commit 40cd01a9c324("efi/loongarch: libstub: remove dependency on
flattened DT"), we can parse the FDT from efi system table.
And now, LoongArch is coming to support booting with FDT, so we add the
relevant booting support as well as parameter parsing.
Signed-off-by: Binbin Zhou <zhoubinbin@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Loongson-2 series (Loongson-2K500, Loongson-2K1000) don't support
unaligned access in hardware, while for Loongson-3 series (Loongson-3A5000,
Loongson-3C5000) hardware support for unaligned access is configurable.
This patch adds unaligned access emulation for those LoongArch
processors without hardware support.
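As a rough illustration of what emulating an unaligned access means, the
sketch below assembles a little-endian value from individual byte reads, each
of which is trivially aligned. It is a Python model under that assumption;
the function name is hypothetical and the patch itself may be structured
differently.

```python
def emulate_unaligned_load(mem, addr, size, signed=False):
    """Build a little-endian value of `size` bytes from single-byte reads."""
    value = 0
    for i in range(size):
        # Byte i contributes bits [8*i, 8*i + 7] of the result.
        value |= mem[addr + i] << (8 * i)
    if signed and value & (1 << (8 * size - 1)):
        # Sign-extend when the top bit of the loaded value is set.
        value -= 1 << (8 * size)
    return value


memory = bytes(range(16))
word = emulate_unaligned_load(memory, 1, 4)  # 4-byte load at an odd address
```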
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The feature of minimizing the overhead of the struct pages associated with
each HugeTLB page is already implemented on x86_64, and the infrastructure
for this feature is already there, so just selecting
ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP is enough to enable it for LoongArch.
Link: https://lkml.kernel.org/r/20221027125253.3458989-5-chenhuacai@loongson.cn
Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Acked-by: Muchun Song <songmuchun@bytedance.com>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Min Zhou <zhoumin@loongson.cn>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Will Deacon <will@kernel.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Cc: Xuerui Wang <kernel@xen0n.name>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Add sparse memory vmemmap support for LoongArch. SPARSEMEM_VMEMMAP uses a
virtually mapped memmap to optimise pfn_to_page and page_to_pfn
operations. This is the most efficient option when sufficient kernel
resources are available.
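The reason a virtually mapped memmap makes these conversions cheap is that
they reduce to plain array arithmetic. A minimal sketch in Python, assuming a
hypothetical base address and a 64-byte struct page (neither value is taken
from this patch):

```python
# Assumed values for illustration only; real ones are architecture-specific.
VMEMMAP_BASE = 0xFFFF_8000_0000_0000  # hypothetical start of the vmemmap area
STRUCT_PAGE_SIZE = 64                 # assumed sizeof(struct page)


def pfn_to_page(pfn):
    """Address of the struct page for a page frame number: base + index."""
    return VMEMMAP_BASE + pfn * STRUCT_PAGE_SIZE


def page_to_pfn(page_addr):
    """Inverse mapping: index of the struct page within the virtual array."""
    offset = page_addr - VMEMMAP_BASE
    if offset < 0 or offset % STRUCT_PAGE_SIZE:
        raise ValueError("not a struct page address")
    return offset // STRUCT_PAGE_SIZE
```

No section lookup is involved in either direction, only one multiply or
divide by a constant and one add or subtract.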
Link: https://lkml.kernel.org/r/20221027125253.3458989-3-chenhuacai@loongson.cn
Signed-off-by: Min Zhou <zhoumin@loongson.cn>
Signed-off-by: Feiyang Chen <chenfeiyang@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Reviewed-by: Arnd Bergmann <arnd@arndb.de>
Cc: Andy Lutomirski <luto@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Philippe Mathieu-Daudé <philmd@linaro.org>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Will Deacon <will@kernel.org>
Cc: Xuefeng Li <lixuefeng@loongson.cn>
Cc: Xuerui Wang <kernel@xen0n.name>
Cc: Muchun Song <songmuchun@bytedance.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
The loongarch architecture uses the atomic read-modify-write
amadd instruction to implement this_cpu_add(), which is NMI safe.
This means that the old and more-efficient srcu_read_lock() may be
used in NMI context, without the need for srcu_read_lock_nmisafe().
Therefore, add the new Kconfig option ARCH_HAS_NMI_SAFE_THIS_CPU_OPS
to arch/loongarch/Kconfig, which will cause NEED_SRCU_NMI_SAFE to be
deselected, thus preserving the current srcu_read_lock() behavior.
Link: https://lore.kernel.org/all/20220910221947.171557773@linutronix.de/
Suggested-by: Neeraj Upadhyay <quic_neeraju@quicinc.com>
Suggested-by: Frederic Weisbecker <frederic@kernel.org>
Suggested-by: Boqun Feng <boqun.feng@gmail.com>
Signed-off-by: Paul E. McKenney <paulmck@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: John Ogness <john.ogness@linutronix.de>
Cc: Petr Mladek <pmladek@suse.com>
Cc: <loongarch@lists.linux.dev>
BPF programs are normally handled by a BPF interpreter. Add BPF JIT
support for LoongArch to allow the kernel to generate native code when
a program is loaded into the kernel. This will significantly speed up
processing of BPF programs.
Co-developed-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
This patch adds support for kdump. In kdump case the normal kernel will
reserve a region for the crash kernel and jump there on panic.
Arch-specific functions are added to allow for implementing a crash dump
file interface, /proc/vmcore, which can be viewed as an ELF file.
A user-space tool, such as kexec-tools, is responsible for allocating a
separate region for the core's ELF header within the crash kdump kernel
memory and filling it in when executing kexec_load().
Then, its location will be advertised to the crash dump kernel via a
command line argument "elfcorehdr=", and the crash dump kernel will
preserve this region for later use with arch_reserve_vmcore() at boot
time.
At the same time, the crash kdump kernel is also limited within the
"crashkernel" area via a command line argument "mem=", so as not to
destroy the original kernel dump data.
In the crash dump kernel environment, /proc/vmcore is used to access the
primary kernel's memory with copy_oldmem_page().
I tested kdump on LoongArch machines (Loongson-3A5000) and it works as
expected (the suggested crashkernel parameter is "crashkernel=512M@2560M").
You may test it by triggering a crash through /proc/sysrq-trigger:
$ sudo kexec -p /boot/vmlinux-kdump --reuse-cmdline --append="nr_cpus=1"
# echo c > /proc/sysrq-trigger
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Add three new files, kexec.h, machine_kexec.c and relocate_kernel.S to
the LoongArch architecture, so as to add support for the kexec re-boot
mechanism (CONFIG_KEXEC) on LoongArch platforms.
Kexec supports loading vmlinux.elf in ELF format and vmlinux.efi in PE
format.
I tested kexec on LoongArch machines (Loongson-3A5000) and it works as
expected:
$ sudo kexec -l /boot/vmlinux.efi --reuse-cmdline
$ sudo kexec -e
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Inspired by commit 9fb7410f955("arm64/BUG: Use BRK instruction for
generic BUG traps"), do similar for LoongArch to use generic BUG()
handler.
This patch uses the BREAK software breakpoint instruction to generate
a trap instead, similarly to most other arches, with the generic BUG
code generating the dmesg boilerplate.
This allows bug metadata to be moved to a separate table and reduces
the amount of inline code at BUG() and WARN() sites. This also avoids
clobbering any registers before they can be dumped.
To mitigate the size of the bug table further, this patch makes use of
the existing infrastructure for encoding addresses within the bug table
as 32-bit relative pointers instead of absolute pointers.
(Note: this limits the max kernel size to 2GB.)
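The size saving and the 2GB limit both follow from storing a signed 32-bit
displacement instead of a 64-bit absolute address. A small Python sketch of
that encoding, with hypothetical helper names and example addresses:

```python
import struct


def encode_rel32(entry_addr, target_addr):
    """Store target as a signed 32-bit offset from the table entry itself."""
    disp = target_addr - entry_addr
    if not -(1 << 31) <= disp < (1 << 31):
        raise OverflowError("target is more than 2GB away from the entry")
    return struct.pack("<i", disp)  # 4 bytes instead of 8


def decode_rel32(entry_addr, raw):
    """Recover the absolute address from the entry address and the offset."""
    (disp,) = struct.unpack("<i", raw)
    return entry_addr + disp


# Example addresses are made up for illustration.
entry = 0x9000_0000_0120_0000
bug_site = 0x9000_0000_0020_1234
raw = encode_rel32(entry, bug_site)
```

An entry can only describe a location within 2GB of itself in either
direction, which is where the limit on kernel size comes from.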
Before patch:
[ 3018.338013] lkdtm: Performing direct entry BUG
[ 3018.342445] Kernel bug detected[#5]:
[ 3018.345992] CPU: 2 PID: 865 Comm: cat Tainted: G D 6.0.0-rc6+ #35
After patch:
[ 125.585985] lkdtm: Performing direct entry BUG
[ 125.590433] ------------[ cut here ]------------
[ 125.595020] kernel BUG at drivers/misc/lkdtm/bugs.c:78!
[ 125.600211] Oops - BUG[#1]:
[ 125.602980] CPU: 3 PID: 410 Comm: cat Not tainted 6.0.0-rc6+ #36
Compared to before, out-of-line file/line information is now reported.
Signed-off-by: Youling Tang <tangyouling@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
The perf events infrastructure of LoongArch is very similar to old MIPS-
based Loongson, so most of the code is derived from MIPS.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
We can support more cache attributes (e.g., CC, SUC and WUC) and page
protection when we use TLB for ioremap(). The implementation is based
on GENERIC_IOREMAP.
The existing simple ioremap() implementation has better performance so
we keep it and introduce ARCH_IOREMAP to control the selection.
We move pagetable_init() earlier to make early ioremap() work, and we
modify the PCI ecam mapping because the TLB-based version of ioremap()
will actually take the size into account.
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
Accidental access to /dev/mem is obviously disastrous, but specific
access can be used by people debugging the kernel. So select
GENERIC_LIB_DEVMEM_IS_ALLOWED, as well as define ARCH_HAS_VALID_PHYS_ADDR_RANGE
and related helpers, to support access filtering for the /dev/mem interface.
Signed-off-by: Weihao Li <liweihao@loongson.cn>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
GNU as >= 2.40 and GCC >= 13 will support using explicit relocation
hints in the assembly code, instead of la.* macros. The usage of
explicit relocation hints can improve code generation so it's enabled
by default by GCC >= 13.
Introduce a Kconfig option AS_HAS_EXPLICIT_RELOCS as the switch for
"use explicit relocation hints or not".
Tested-by: WANG Xuerui <git@xen0n.name>
Signed-off-by: Xi Ruoyao <xry111@xry111.site>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
There is a spelling mistake in a commented section. Fix it.
Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Signed-off-by: Huacai Chen <chenhuacai@loongson.cn>
linux-next for a couple of months without, to my knowledge, any negative
reports (or any positive ones, come to that).
- Also the Maple Tree from Liam R. Howlett. An overlapping range-based
tree for vmas. It is apparently slightly more efficient in its own right,
but is mainly targeted at enabling work to reduce mmap_lock contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially converted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
(https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com).
This has yet to be addressed due to Liam's unfortunately timed
vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-uninitialized bugs down to
the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to support
file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added additional userspace visibility into KSM merging activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Merge tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- Yu Zhao's Multi-Gen LRU patches are here. They've been under test in
linux-next for a couple of months without, to my knowledge, any
negative reports (or any positive ones, come to that).
- Also the Maple Tree from Liam Howlett. An overlapping range-based
tree for vmas. It is apparently slightly more efficient in its own
right, but is mainly targeted at enabling work to reduce mmap_lock
contention.
Liam has identified a number of other tree users in the kernel which
could be beneficially converted to mapletrees.
Yu Zhao has identified a hard-to-hit but "easy to fix" lockdep splat
at [1]. This has yet to be addressed due to Liam's unfortunately
timed vacation. He is now back and we'll get this fixed up.
- Dmitry Vyukov introduces KMSAN: the Kernel Memory Sanitizer. It uses
clang-generated instrumentation to detect used-uninitialized bugs down
to the single bit level.
KMSAN keeps finding bugs. New ones, as well as the legacy ones.
- Yang Shi adds a userspace mechanism (madvise) to induce a collapse of
memory into THPs.
- Zach O'Keefe has expanded Yang Shi's madvise(MADV_COLLAPSE) to
support file/shmem-backed pages.
- userfaultfd updates from Axel Rasmussen
- zsmalloc cleanups from Alexey Romanov
- cleanups from Miaohe Lin: vmscan, hugetlb_cgroup, hugetlb and
memory-failure
- Huang Ying adds enhancements to NUMA balancing memory tiering mode's
page promotion, with a new way of detecting hot pages.
- memcg updates from Shakeel Butt: charging optimizations and reduced
memory consumption.
- memcg cleanups from Kairui Song.
- memcg fixes and cleanups from Johannes Weiner.
- Vishal Moola provides more folio conversions
- Zhang Yi removed ll_rw_block() :(
- migration enhancements from Peter Xu
- migration error-path bugfixes from Huang Ying
- Aneesh Kumar added ability for a device driver to alter the memory
tiering promotion paths. For optimizations by PMEM drivers, DRM
drivers, etc.
- vma merging improvements from Jakub Matěn.
- NUMA hinting cleanups from David Hildenbrand.
- xu xin added additional userspace visibility into KSM merging
activity.
- THP & KSM code consolidation from Qi Zheng.
- more folio work from Matthew Wilcox.
- KASAN updates from Andrey Konovalov.
- DAMON cleanups from Kaixu Xia.
- DAMON work from SeongJae Park: fixes, cleanups.
- hugetlb sysfs cleanups from Muchun Song.
- Mike Kravetz fixes locking issues in hugetlbfs and in hugetlb core.
Link: https://lkml.kernel.org/r/CAOUHufZabH85CeUN-MEMgL8gJGzJEWUrkiM58JkTbBhh-jew0Q@mail.gmail.com [1]
* tag 'mm-stable-2022-10-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (555 commits)
hugetlb: allocate vma lock for all sharable vmas
hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer
hugetlb: fix vma lock handling during split vma and range unmapping
mglru: mm/vmscan.c: fix imprecise comments
mm/mglru: don't sync disk for each aging cycle
mm: memcontrol: drop dead CONFIG_MEMCG_SWAP config symbol
mm: memcontrol: use do_memsw_account() in a few more places
mm: memcontrol: deprecate swapaccounting=0 mode
mm: memcontrol: don't allocate cgroup swap arrays when memcg is disabled
mm/secretmem: remove reduntant return value
mm/hugetlb: add available_huge_pages() func
mm: remove unused inline functions from include/linux/mm_inline.h
selftests/vm: add selftest for MADV_COLLAPSE of uffd-minor memory
selftests/vm: add file/shmem MADV_COLLAPSE selftest for cleared pmd
selftests/vm: add thp collapse shmem testing
selftests/vm: add thp collapse file and tmpfs testing
selftests/vm: modularize thp collapse memory operations
selftests/vm: dedup THP helpers
mm/khugepaged: add tracepoint to hpage_collapse_scan_file()
mm/madvise: add file and shmem support to MADV_COLLAPSE
...
- implement EFI boot support for LoongArch
- implement generic EFI compressed boot support for arm64, RISC-V and
LoongArch, none of which implement a decompressor today
- measure the kernel command line into the TPM if measured boot is in
effect
- refactor the EFI stub code in order to isolate DT dependencies for
architectures other than x86
- avoid calling SetVirtualAddressMap() on arm64 if the configured size
of the VA space guarantees that doing so is unnecessary
- move some ARM specific code out of the generic EFI source files
- unmap kernel code from the x86 mixed mode 1:1 page tables
Merge tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi
Pull EFI updates from Ard Biesheuvel:
"A bit more going on than usual in the EFI subsystem. The main driver
for this has been the introduction of the LoongArch architecture last
cycle, which inspired some cleanup and refactoring of the EFI code.
Another driver for EFI changes this cycle and in the future is
confidential compute.
The LoongArch architecture does not use either struct bootparams or DT
natively [yet], and so passing information between the EFI stub and
the core kernel using either of those is undesirable. And in general,
overloading DT has been a source of issues on arm64, so using DT for
this on new architectures is something to avoid for the time being (even if we
might converge on something DT based for non-x86 architectures in the
future). For this reason, in addition to the patch that enables EFI
boot for LoongArch, there are a number of refactoring patches applied
on top of which separate the DT bits from the generic EFI stub bits.
These changes are on a separate topic branch that has been shared
with the LoongArch maintainers, who will include it in their pull
request as well. This is not ideal, but the best way to manage the
conflicts without stalling LoongArch for another cycle.
Another development inspired by LoongArch is the newly added support
for EFI based decompressors. Instead of adding yet another
arch-specific incarnation of this pattern for LoongArch, we are
introducing an EFI app based on the existing EFI libstub
infrastructure that encapsulates the decompression code we use on other
architectures, but in a way that is fully generic. This has been
developed and tested in collaboration with distro and systemd folks,
who are eager to start using this for systemd-boot and also for arm64
secure boot on Fedora. Note that the EFI zimage files this introduces
can also be decompressed by non-EFI bootloaders if needed, as the
image header describes the location of the payload inside the image,
and the type of compression that was used. (Note that Fedora's arm64
GRUB is buggy [0] so you'll need a recent version or switch to
systemd-boot in order to use this.)
Finally, we are adding TPM measurement of the kernel command line
provided by EFI. There is an oversight in the TCG spec which results
in a blind spot for command line arguments passed to loaded images,
which means that either the loader or the stub needs to take the
measurement. Given the combinatorial explosion I am anticipating when
it comes to firmware/bootloader stacks and firmware based attestation
protocols (SEV-SNP, TDX, DICE, DRTM), it is good to set a baseline now
when it comes to EFI measured boot, which is that the kernel measures
the initrd and command line. Intermediate loaders can measure
additional assets if needed, but with the baseline in place, we can
deploy measured boot in a meaningful way even if you boot into Linux
straight from the EFI firmware.
Summary:
- implement EFI boot support for LoongArch
- implement generic EFI compressed boot support for arm64, RISC-V and
LoongArch, none of which implement a decompressor today
- measure the kernel command line into the TPM if measured boot is in
effect
- refactor the EFI stub code in order to isolate DT dependencies for
architectures other than x86
- avoid calling SetVirtualAddressMap() on arm64 if the configured
size of the VA space guarantees that doing so is unnecessary
- move some ARM specific code out of the generic EFI source files
- unmap kernel code from the x86 mixed mode 1:1 page tables"
* tag 'efi-next-for-v6.1' of git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi: (24 commits)
efi/arm64: libstub: avoid SetVirtualAddressMap() when possible
efi: zboot: create MemoryMapped() device path for the parent if needed
efi: libstub: fix up the last remaining open coded boot service call
efi/arm: libstub: move ARM specific code out of generic routines
efi/libstub: measure EFI LoadOptions
efi/libstub: refactor the initrd measuring functions
efi/loongarch: libstub: remove dependency on flattened DT
efi: libstub: install boot-time memory map as config table
efi: libstub: remove DT dependency from generic stub
efi: libstub: unify initrd loading between architectures
efi: libstub: remove pointless goto kludge
efi: libstub: simplify efi_get_memory_map() and struct efi_boot_memmap
efi: libstub: avoid efi_get_memory_map() for allocating the virt map
efi: libstub: drop pointless get_memory_map() call
efi: libstub: fix type confusion for load_options_size
arm64: efi: enable generic EFI compressed boot
loongarch: efi: enable generic EFI compressed boot
riscv: efi: enable generic EFI compressed boot
efi/libstub: implement generic EFI zboot
efi/libstub: move efi_system_table global var into separate object
...