Commit Graph

4038 Commits

Author SHA1 Message Date
Jinjie Ruan
1a74833182
riscv: stacktrace: Add USER_STACKTRACE support
Currently, userstacktrace is unsupported for riscv. So use the
perf_callchain_user() code as blueprint to implement the
arch_stack_walk_user() which add userstacktrace support on riscv.
Meanwhile, we can use arch_stack_walk_user() to simplify the implementation
of perf_callchain_user().

A ftrace test case is shown as below:

	# cd /sys/kernel/debug/tracing
	# echo 1 > options/userstacktrace
	# echo 1 > options/sym-userobj
	# echo 1 > events/sched/sched_process_fork/enable
	# cat trace
	......
	            bash-178     [000] ...1.    97.968395: sched_process_fork: comm=bash pid=178 child_comm=bash child_pid=231
	            bash-178     [000] ...1.    97.970075: <user stack trace>
	 => /lib/libc.so.6[+0xb5090]

Also a simple perf test is ok as below:

	# perf record -e cpu-clock --call-graph fp top
	# perf report --call-graph

	.....
	[[31m  66.54%[[m     0.00%  top      [kernel.kallsyms]            [k] ret_from_exception
            |
            ---ret_from_exception
               |
               |--[[31m58.97%[[m--do_trap_ecall_u
               |          |
               |          |--[[31m17.34%[[m--__riscv_sys_read
               |          |          ksys_read
               |          |          |
               |          |           --[[31m16.88%[[m--vfs_read
               |          |                     |
               |          |                     |--[[31m10.90%[[m--seq_read

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Tested-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-3-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 23:57:16 -07:00
Jinjie Ruan
22ab08955e
riscv: Fix fp alignment bug in perf_callchain_user()
The standard RISC-V calling convention said:
	"The stack grows downward and the stack pointer is always
	kept 16-byte aligned".

So perf_callchain_user() should check whether 16-byte aligned for fp.

Link: https://riscv.org/wp-content/uploads/2015/01/riscv-calling.pdf

Fixes: dbeb90b0c1 ("riscv: Add perf callchain support")
Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Cc: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/r/20240708032847.2998158-2-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 23:57:15 -07:00
Paolo Bonzini
0cdcc99eea Merge tag 'kvm-riscv-6.12-1' of https://github.com/kvm-riscv/linux into HEAD
KVM/riscv changes for 6.12

- Fix sbiret init before forwarding to userspace
- Don't zero-out PMU snapshot area before freeing data
- Allow legacy PMU access from guest
- Fix to allow hpmcounter31 from the guest
2024-09-15 02:43:17 -04:00
Stuart Menefy
d6a1928134
riscv: Remove redundant restriction on memory size
The original reason for reserving the top 4GiB of the direct map
(space for modules/BPF/kernel) hasn't applied since the address
map was reworked for KASAN.

Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240624121723.2186279-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 01:08:56 -07:00
Changbin Du
7587a3602b
riscv: vdso: do not strip debugging info for vdso.so.dbg
The vdso.so.dbg is a debug version of vdso and could be used for debugging
purpose. For example, perf-annotate requires debugging info to show source
lines. So let's keep its debugging info.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Reviewed-by: Cyril Bur <cyrilbur@tenstorrent.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240611040947.3024710-1-changbin.du@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-14 01:02:30 -07:00
Alice Ryhl
d077242d68 rust: support for shadow call stack sanitizer
Add all of the flags that are needed to support the shadow call stack
(SCS) sanitizer with Rust, and updates Kconfig to allow only
configurations that work.

The -Zfixed-x18 flag is required to use SCS on arm64, and requires rustc
version 1.80.0 or greater. This restriction is reflected in Kconfig.

When CONFIG_DYNAMIC_SCS is enabled, the build will be configured to
include unwind tables in the build artifacts. Dynamic SCS uses the
unwind tables at boot to find all places that need to be patched. The
-Cforce-unwind-tables=y flag ensures that unwind tables are available
for Rust code.

In non-dynamic mode, the -Zsanitizer=shadow-call-stack flag is what
enables the SCS sanitizer. Using this flag requires rustc version 1.82.0
or greater on the targets used by Rust in the kernel. This restriction
is reflected in Kconfig.

It is possible to avoid the requirement of rustc 1.80.0 by using
-Ctarget-feature=+reserve-x18 instead of -Zfixed-x18. However, this flag
emits a warning during the build, so this patch does not add support for
using it and instead requires 1.80.0 or greater.

The dependency is placed on `select HAVE_RUST` to avoid a situation
where enabling Rust silently turns off the sanitizer. Instead, turning
on the sanitizer results in Rust being disabled. We generally do not
want changes to CONFIG_RUST to result in any mitigations being changed
or turned off.

At the time of writing, rustc 1.82.0 only exists via the nightly release
channel. There is a chance that the -Zsanitizer=shadow-call-stack flag
will end up needing 1.83.0 instead, but I think it is small.

Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Kees Cook <kees@kernel.org>
Acked-by: Will Deacon <will@kernel.org>
Signed-off-by: Alice Ryhl <aliceryhl@google.com>
Link: https://lore.kernel.org/r/20240829-shadow-call-stack-v7-1-2f62a4432abf@google.com
[ Fixed indentation using spaces. - Miguel ]
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2024-09-13 00:03:14 +02:00
Linus Torvalds
8581ae1ea0 RISC-V Fixes for 6.11-rc8
* Two fixes for smp_processor_id() calls in preemptible sections: one if
   the perf driver, and one in the fence.i prctl.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmbjDTkTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiRsKD/428yAcDT6JKjG+jzYnMARDxl+OS0hJ
 KKwfeglvpPXWqZoZlyhPgJMJ/fNalrH/dVwl01a3Aobjy8/JV6W6L2Cc/lwRBNXv
 O+nfCzzJoZs+v7D0MR2Ad101vfToafDaiZ9Q6jH4/9yi/m5zAaIHdEnjYQQW5rLc
 FMnfGDHXo40WkrnjEjefeh7n88d5dtyD3szwS8khSnJulGFRHHWW0XL54wVeY3wE
 NvLKdD5PDR7kzOE6BvofPQ1tIJxpWnE+jqSgHmS1eYKlwwzb1kvtGXiktUl/gcwb
 /+jFgK49gqWMpx1ji+utw4R357uIZY0DPIiMvY7K5PjeY+TXtltTBgfyvjklpLWm
 YrZ1snpMiMl/VTxFYVFMCvowK/DvylvSQ737zwMaxTq/DGwvr8GPuRO1Pnwc/3QZ
 wCLy3rsK56LE46+9IrYeauYNiOl1bd6znuASzrPhr8Ecexm2DW+JZqRRZCMl3UTK
 p9hrr3CxUT4hi0bksjkSOBks7cLNp3Rqa/fEBtM6YKKzSU6NlEoNpuiH7sjh7ETE
 Zk4qUjRPiRcje0XL2NTKfpKSGuxK/xP4188aYA5jZ1QxO/zditk8BrS5WRfgEoPr
 gjNapO25nrnJbgi9PaClMdeYNV//Fi/0vjpDjw6pYfJaZSE0sWMqp5/bWt5Bsy2A
 e/OyJ+U4TPY4vQ==
 =pEDD
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - Two fixes for smp_processor_id() calls in preemptible sections: one
   if the perf driver, and one in the fence.i prctl.

* tag 'riscv-for-linus-6.11-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
  drivers: perf: Fix smp_processor_id() use in preemptible code
2024-09-12 13:03:45 -07:00
Palmer Dabbelt
9ea7b92b77
Merge patch series "remove size limit on XIP kernel"
Nam Cao <namcao@linutronix.de> says:

Hi,

For XIP kernel, the writable data section is always at offset specified in
XIP_OFFSET, which is hard-coded to 32MB.

Unfortunately, this means the read-only section (placed before the
writable section) is restricted in size. This causes build failure if the
kernel gets too large.

This series remove the use of XIP_OFFSET one by one, then remove this
macro entirely at the end, with the goal of lifting this size restriction.

Also some cleanup and documentation along the way.

* b4-shazam-merge
  riscv: remove limit on the size of read-only section for XIP kernel
  riscv: drop the use of XIP_OFFSET in create_kernel_page_table()
  riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa()
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET
  riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET
  riscv: replace misleading va_kernel_pa_offset on XIP kernel
  riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
  riscv: cleanup XIP_FIXUP macro
  riscv: change XIP's kernel_map.size to be size of the entire kernel
  ...

Link: https://lore.kernel.org/r/cover.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:05 -07:00
Nam Cao
b635a84bde
riscv: remove limit on the size of read-only section for XIP kernel
XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size. This causes
build failures if the kernel gets too big [1].

Remove this limit.

Reported-by: kernel test robot <lkp@intel.com>
Closes: https://lore.kernel.org/oe-kbuild-all/202404211031.J6l2AfJk-lkp@intel.com [1]
Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/3bf3a77be10ebb0d8086c028500baa16e7a8e648.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:02 -07:00
Nam Cao
a7cfb99943
riscv: drop the use of XIP_OFFSET in create_kernel_page_table()
XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded value entirely, stop using
XIP_OFFSET in create_kernel_page_table(). Instead use _sdata and _start to
do the same thing.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/4ea3f222a7eb9f91c04b155ff2e4d3ef19158acc.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:01 -07:00
Nam Cao
75fdf791df
riscv: drop the use of XIP_OFFSET in kernel_mapping_va_to_pa()
XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely,
remove the use of XIP_OFFSET in kernel_mapping_va_to_pa(). The macro
XIP_OFFSET is used in this case to check if the virtual address is mapped
to Flash or to RAM. The same check can be done with kernel_map.xiprom_sz.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/644c13d9467525a06f5d63d157875a35b2edb4bc.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:23:00 -07:00
Nam Cao
23311f57ee
riscv: drop the use of XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET
XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop
using XIP_OFFSET in XIP_FIXUP_FLASH_OFFSET. Instead, use __data_loc and
_sdata to do the same thing.

While at it, also add a description for XIP_FIXUP_FLASH_OFFSET.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/7b3319657edd1822f3457e7e7c07aaa326cc2f87.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:59 -07:00
Nam Cao
e4eac34fed
riscv: drop the use of XIP_OFFSET in XIP_FIXUP_OFFSET
XIP_OFFSET is the hard-coded offset of writable data section within the
kernel.

By hard-coding this value, the read-only section of the kernel (which is
placed before the writable data section) is restricted in size.

As a preparation to remove this hard-coded macro XIP_OFFSET entirely, stop
using XIP_OFFSET in XIP_FIXUP_OFFSET. Instead, use CONFIG_PHYS_RAM_BASE and
_sdata to do the same thing.

While at it, also add a description for XIP_FIXUP_OFFSET.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/dba0409518b14ee83b346e099b1f7f934daf7b74.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:58 -07:00
Nam Cao
5cf0896721
riscv: replace misleading va_kernel_pa_offset on XIP kernel
On XIP kernel, the name "va_kernel_pa_offset" is misleading: unlike
"normal" kernel, it is not the virtual-physical address offset of kernel
mapping, it is the offset of kernel mapping's first virtual address to
first physical address in DRAM, which is not meaningful because the
kernel's first physical address is not in DRAM.

For XIP kernel, there are 2 different offsets because the read-only part of
the kernel resides in ROM while the rest is in RAM. The offset to ROM is in
kernel_map.va_kernel_xip_pa_offset, while the offset to RAM is not stored
anywhere: it is calculated on-the-fly.

Remove this confusing "va_kernel_pa_offset" and add
"va_kernel_xip_data_pa_offset" as its replacement. This new variable is the
offset of virtual mapping of the kernel's data portion to the corresponding
physical addresses.

With the introduction of this new variable, also rename
va_kernel_xip_pa_offset -> va_kernel_xip_text_pa_offset to make it clear
that this one is about the .text section.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/84e5d005c1386d88d7b2531e0b6707ec5352ee54.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:57 -07:00
Nam Cao
f2df5b4fdd
riscv: don't export va_kernel_pa_offset in vmcoreinfo for XIP kernel
The crash utility uses va_kernel_pa_offset to translate virtual addresses.
This is incorrect in the case of XIP kernel, because va_kernel_pa_offset is
not the virtual-physical address offset (yes, the name is misleading; this
variable will be removed for XIP in a following commit).

Stop exporting this variable for XIP kernel. The replacement is to be
determined, note it as a TODO for now.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/8f8760d3f9a11af4ea0acbc247e4f49ff5d317e9.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:56 -07:00
Nam Cao
aa3457f22f
riscv: cleanup XIP_FIXUP macro
The XIP_FIXUP macro is used to fix addresses early during boot before MMU:
generated code "thinks" the data section is in ROM while it is actually in
RAM. So this macro corrects the addresses in the data section.

This macro determines if the address needs to be fixed by checking if it is
within the range starting from ROM address up to the size of (2 *
XIP_OFFSET).

This means if the kernel size is bigger than (2 * XIP_OFFSET), some
addresses would not be fixed up.

XIP kernel can still work if the above scenario does not happen. But this
macro is obviously incorrect.

Rewrite this macro to only fix up addresses within the data section.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/95f50a4ec8204ec4fcbf2a80c9addea0e0609e3b.1717789719.git.namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-12 07:22:55 -07:00
Rafael J. Wysocki
45de40574f Merge branch 'acpi-riscv'
Merge ACPI and irqchip updates related to external interrupt controller
support on RISC-V:

 - Add ACPI device enumeration support for interrupt controller probing
   including taking dependencies into account (Sunil V L).

 - Implement ACPI-based interrupt controller probing on RISC-V (Sunil V L).

 - Add ACPI support for AIA in riscv-intc and add ACPI support to
   riscv-imsic, riscv-aplic, and sifive-plic (Sunil V L).

* acpi-riscv:
  irqchip/sifive-plic: Add ACPI support
  irqchip/riscv-aplic: Add ACPI support
  irqchip/riscv-imsic: Add ACPI support
  irqchip/riscv-imsic-state: Create separate function for DT
  irqchip/riscv-intc: Add ACPI support for AIA
  ACPI: RISC-V: Implement function to add implicit dependencies
  ACPI: RISC-V: Initialize GSI mapping structures
  ACPI: RISC-V: Implement function to reorder irqchip probe entries
  ACPI: RISC-V: Implement PCI related functionality
  ACPI: pci_link: Clear the dependencies after probe
  ACPI: bus: Add RINTC IRQ model for RISC-V
  ACPI: scan: Define weak function to populate dependencies
  ACPI: scan: Add RISC-V interrupt controllers to honor list
  ACPI: scan: Refactor dependency creation
  ACPI: bus: Add acpi_riscv_init() function
  ACPI: scan: Add a weak arch_sort_irqchip_probe() to order the IRQCHIP probe
  arm64: PCI: Migrate ACPI related functions to pci-acpi.c
2024-09-11 21:44:22 +02:00
Linus Torvalds
77f5878967 ARM: SoC fixes for 6.11, part 3
The bulk of the changes this time are for device tree files in the
 rockchips platform, addressing correctness issues on individual
 boards, plus one change in the rk356x SoC file to make it match
 the binding.
 
 The only other changes that came in are
 
  - a CPU frequencey scaling fix for JH7110 (RISC-V)
  - a build fix for the cznic hwrandom driver
  - a fix for a deadlock in qualcomm uefi secure
    application firmware driver
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmbhsaQACgkQYKtH/8kJ
 Uif7IBAAwTdAjhxjiBk5ktIkrhNRG9VCvXDwr9Ji70yphxufSIZDQSa+pdN5EFyR
 EgWCTQkZ2Ctw7s9lQmC0Tu3Lxmpu5Q7838UJqlEon2K2yJkKfLG+OSqSwNU6l44u
 GcTND3kGUDTm+uot5Ne0F7dnPiLBWmithKa75/TIoza07Ekz4ynb2fYRuW6JURZ4
 6WCTziL0jCFcUALtjibgno3lG06AwrQWEKd3n53ws9ttnNtpWMzfkDnuF4dBcPod
 vEmTOIaJkEKV80nwupZw8aCKGxe8mARej2kGPZgm9heNfnQk/V7e/1wCm0q8tw3t
 kfUYLN9I/aXTcZyLwixCAVeWhCtONrYBHbZTjXVO2bONtGatiQKgJ1LwhjAOUVMV
 iG20E3P+9OIZ3VIQ9k0Sc3Ys3Sw9Vdd9y01pzv+SyzewnI0h9qHXOrkChx36iwSH
 wsJ9vqZUtLgxcYDYR9JEBEfK9Qaz7X59xtfw75jbiQDzIitvATxA+7HT+7/UPDuA
 d6y5e9i/27+UebImNNtK1+XgHH0qkdBOFA7CHWsvijKgI2GiNkFa1CALBio37dVz
 IBGMFTTHPsCKFiTfy7d4O6VgUeUOjhWXbVEScE7QoHmaQQZ8w0MD6uvOJ8m4cIAt
 V2xbw1EUplldhwRQtsUbsYbopkJBzk+1AWGcIx5ccd3nWCIJkew=
 =Waa0
 -----END PGP SIGNATURE-----

Merge tag 'arm-fixes-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull ARM SoC fixes from Arnd Bergmann:
 "The bulk of the changes this time are for device tree files in the
  rockchips platform, addressing correctness issues on individual
  boards, plus one change in the rk356x SoC file to make it match the
  binding.

  The only other changes that came in are

   - a CPU frequencey scaling fix for JH7110 (RISC-V)

   - a build fix for the cznic hwrandom driver

   - a fix for a deadlock in qualcomm uefi secure application firmware
     driver"

* tag 'arm-fixes-6.11-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  platform: cznic: turris-omnia-mcu: fix HW_RANDOM dependency
  riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0 rate to 1.5GHz
  firmware: qcom: uefisecapp: Fix deadlock in qcuefi_acquire()
  arm64: dts: rockchip: Fix compatibles for RK3588 VO{0,1}_GRF
  dt-bindings: soc: rockchip: Fix compatibles for RK3588 VO{0,1}_GRF
  arm64: dts: rockchip: override BIOS_DISABLE signal via GPIO hog on RK3399 Puma
  arm64: dts: rockchip: fix eMMC/SPI corruption when audio has been used on RK3399 Puma
  arm64: dts: rockchip: fix PMIC interrupt pin in pinctrl for ROCK Pi E
  arm64: dts: rockchip: Remove broken tsadc pinctrl binding for rk356x
2024-09-11 11:26:56 -07:00
Arnd Bergmann
3c557d0062 RISC-V config for v6.12
Two patches, enabling clock and pinctrl support in defconfig for Sopghgo
 devices.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZuDAZwAKCRB4tDGHoIJi
 0pMkAQCnzUqAlnSbaXlAMvnTvAA370PysOIOYIg5fiUC45LqGwD9ELhGn7w2+BGe
 yHUYieCn70X00cl+DJSB2Boa06gd4QQ=
 =jq6X
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmbhXPEACgkQYKtH/8kJ
 UidjzQ//UTUxupOLz1XHg7cGj/75WBsVcuQA3e7KXRV1OwNJ3PeJGOufFB1XZcKF
 bF+oRUR3jARFrpSoQFvCjxpl4Q7fFVXsGC312F2amFTs0DYK+mpkFhLd0uO3PJYB
 jQ5IDdasSboSiMl4j8TiAFxqGPM3kOPZzX3+KtLzlZJzm9U2excYagg2wiJ66Z7b
 CtPCV0yDHYR06HhSnHFW2BYlPV1r1U+dtC59/eWq9ucb3ItrSlcaiD9eS6iHVo7E
 3QUFcMfcgEJ9yOZ27BmESZFbw4ZsB6eYdK39RTHmy9I3LzbvLp57AgPGh461tuAR
 s4PmqN4Jd0u+RbTCmAI9CCDM4Vy1qvocvp9HXDwJa5NJx87YiQee1hYu2wuni+0I
 W5/202tAwfqMHtFm/bn9JG62rOiD0jNFYLDmJ8CILJ8bv4b5VfmEvKUqE0w0pNYD
 DxGKlEOKFyNOl3P62o3OtbtQhNWmDdZrZuBrOw8Bl9dsoKjxTmq9S5lrAUii6NMv
 gTU1J1af1ONyk/IFXB+20q6u24q7FUnIDSCgevMzDPNDOZPLloA60PinA60daGCn
 ZWMiAa5WWNSi2TMBYa1Ve8hs017YC0bBBYSPY5VUMRC3emItEmYF2nExzZcMIqFE
 YUtxXzoOYfVOnSN+ShYe5fwa3d+xP5g4rs5JNRvWFVCIjZfyQfU=
 =Ihjw
 -----END PGP SIGNATURE-----

Merge tag 'riscv-config-for-v6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/defconfig

RISC-V config for v6.12

Two patches, enabling clock and pinctrl support in defconfig for Sopghgo
devices.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-config-for-v6.12' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: defconfig: Enable pinctrl support for CV18XX Series SoC
  riscv: defconfig: sophgo: enable clks for sg2042

Link: https://lore.kernel.org/r/20240910-annex-ravage-07d63041a7c5@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-11 09:03:44 +00:00
Arnd Bergmann
0e7af99aef RISC-V soc fixes for v6.11-final
StarFive:
 A fix to return one of the clocks on the JH7110 from 1 GHz to 1.5 GHz
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZt8G1wAKCRB4tDGHoIJi
 0q/hAQD++rcYdOcP4iwguSS3ZkbCiyMdLDUuVvSiHUtR5dS0WgEAvXE1i0eEwt63
 BELsrBNFkeUhLrBfHtN0k2MNvfPXxw4=
 =VjNw
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmbhWs4ACgkQYKtH/8kJ
 UieKcRAAyg582dYBmQr2z2u+X0XwR85nPCmwtZT7j1DbR+knBLt8s0+JqNu0g3sn
 NFg14CI3CtaTS96JaFOzXHKpD96oVTyozs0AU5jtCmD0/+RmOXIByrc1hMzRCP8C
 RNhFTwuuQsB3aP56EhL07CAwpTE0lZSdXORtQMn+vJ+H4RF5n0fzjujGLXWbEpOZ
 8tmIFMEkgEILNcRxPRDLsa4hwUhqNiQDBNsV+QtDaaRfPUBHnXdfv88xkXZCX7EF
 wZRi+WobYbAIVIdEfQ01DfsjSKDmief3kZvm7nj4pBV0QP8O3sOe+xFrKcUguBwb
 6tTACRzC9CaJvRNnkiHGvTMuT98500kd5P7TEnBZKgmHlWWYAxOivYO/mu/WYxVK
 Erb/i08XHopNK46xm1Ue/AM3eeXw6I6lAeZquEREZK2zGS6DE/LHTxHIlqelGEHt
 9ubBlJLf6IVsmlzNrOyN7lrKkXhHqTkM5O8o3RCckTpsNrmV2TppjOl4s4FKWRPM
 o91f/FrS4CV8QdA7JQdmgB+pQWAE83txbTP5KAZpxEkTDBEd6b1NKllxXe3OD68y
 QA8B7lEw4ltxnzyD+WI4RIhYoBB+1IfguDy6CbtJ6gzKbmszfCXaxSCA1HpBbmPQ
 ilGFdrEFgT9NDbxHXOWsG/MHMjgOsi3e/h4XN/iYAjN0wwQ6ZSY=
 =1HB8
 -----END PGP SIGNATURE-----

Merge tag 'riscv-soc-fixes-for-v6.11-final' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes

RISC-V soc fixes for v6.11-final

StarFive:
A fix to return one of the clocks on the JH7110 from 1 GHz to 1.5 GHz

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-soc-fixes-for-v6.11-final' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0 rate to 1.5GHz

Link: https://lore.kernel.org/r/20240909-hybrid-groovy-601a33b5b309@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-11 08:54:37 +00:00
Charlie Jenkins
7c1e5b9690
riscv: Disable preemption while handling PR_RISCV_CTX_SW_FENCEI_OFF
The icache will be flushed in switch_to() if force_icache_flush is true,
or in flush_icache_deferred() if icache_stale_mask is set. Between
setting force_icache_flush to false and calculating the new
icache_stale_mask, preemption needs to be disabled. There are two
reasons for this:

1. If CPU migration happens between force_icache_flush = false, and the
   icache_stale_mask is set, an icache flush will not be emitted.
2. smp_processor_id() is used in set_icache_stale_mask() to mark the
   current CPU as not needing another flush since a flush will have
   happened either by userspace or by the kernel when performing the
   migration. smp_processor_id() is currently called twice with preemption
   enabled which causes a race condition. It allows
   icache_stale_mask to be populated with inconsistent CPU ids.

Resolve these two issues by setting the icache_stale_mask before setting
force_icache_flush to false, and using get_cpu()/put_cpu() to obtain the
smp_processor_id().

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: 6b9391b581 ("riscv: Include riscv_set_icache_flush_ctx prctl")
Link: https://lore.kernel.org/r/20240903-fix_fencei_optimization-v2-1-8025f20171fc@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-10 20:38:46 -07:00
Heikki Krogerus
0175b1d3c6 RISC-V: configs: enable I2C_DESIGNWARE_CORE with I2C_DESIGNWARE_PLATFORM
The dependency handling of the Synopsys DesignWare I2C
adapter drivers is going to be changed so that the glue
drivers for the PCI and platform buses depend on
I2C_DESIGNWARE_CORE.

Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: linux-riscv@lists.infradead.org
Signed-off-by: Heikki Krogerus <heikki.krogerus@linux.intel.com>
Acked-by: Jarkko Nikula <jarkko.nikula@linux.intel.com>
Signed-off-by: Andi Shyti <andi.shyti@kernel.org>
2024-09-10 00:36:53 +02:00
Inochi Amaoto
72160ec6cb riscv: defconfig: Enable pinctrl support for CV18XX Series SoC
Enable pinctrl driver for the whole CV18XX series.

Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-09-09 12:55:53 +01:00
Xingyu Wu
61f2e8a3a9 riscv: dts: starfive: jh7110-common: Fix lower rate of CPUfreq by setting PLL0 rate to 1.5GHz
CPUfreq supports 4 cpu frequency loads on 375/500/750/1500MHz.
But now PLL0 rate is 1GHz and the cpu frequency loads become
250/333/500/1000MHz in fact.

The PLL0 rate should be default set to 1.5GHz and set the
cpu_core rate to 500MHz in safe.

Fixes: e2c510d6d6 ("riscv: dts: starfive: Add cpu scaling for JH7110 SoC")
Signed-off-by: Xingyu Wu <xingyu.wu@starfivetech.com>
Reviewed-by: Hal Feng <hal.feng@starfivetech.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-09-08 23:20:19 +01:00
Arnd Bergmann
8456010c95 RISC-V Devicetrees for v6.12
Sopgho:
 Added DMA controller for CV18XX.
 Added I2C, MMC, GPIO and onboard MCU (HWMON) for SG2042.
 Enable SDHCI0 for HuashanPi (using cv1812h).
 Some minor changes about dt-bindings for Sipeed LicheeRV Nano board
 (using SG2002, and SG2002 is the new codename of CV181xC).
 
 Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEdoBX2jyDC9ZCTwZjDCzASqG0i0IFAmbXrIgACgkQDCzASqG0
 i0L7IQv/fJPlyfiItMzuJvP75j/Zn/vSQKgrnd2Dbg+nfvIZ5f2CSb0L0zAlZNYX
 E3ToP/MCif55Fom5ySqbtN2YIAiV48PWjejAgztbw0VVcdGAM8q+7opnww/rhwUH
 pgxgse0/K2qn3oerEWZUEGeokqZ9/SmjBAZBK8Nuw4yDpb9HsAKL0trFMIBF1FTl
 VGSi6vGqkaHNFAfs5O2M/NClH53owQZ5ZZsL0tC4Tpuv/lIxYwvywLi8bhXDB29J
 VHP0A5iyHu1QXSyKWhSmxddCdTmG0On//GBJMOeHJOAIGYVg71TzZ2M7zhUVmyYy
 GrzqsZTWMgpfJlNz2olfyUiGYq6kflKgUBe0yiBd/GAJOhHyYmPY5kAMPu+oJOtV
 7Z3DSjCCs2XFSOdlL6kAU0SIqP4UNYpuNLlT2/A2CLf+5C/QyymVD+o5Fe6d9V3K
 fw55vSazM3OF3kPlZqKkBxTbXs5wZjdThteqbUQu5lzcL4jgUfzKq9bug+Uc2l7w
 0OHU/cHf
 =5Y3u
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmbZhPkACgkQYKtH/8kJ
 UifV4xAAnqnCGQ+aWXFhZMMc2Bq+AfgffKY+qg58E5ms6VY+5wXTxbG9qhvLQ5Gf
 RFAzs4OP/qTVr4Qpu2xa82Uv4MDaRPR/qcUXd4xwgB5NTaaFdF/Bz7O7EBl+ZuZf
 vsBD4NUEmDAiUCiBcaR8IICu0Riszkl1mNNPovPrzGOw0T2MxlZa7epFV/msKZXv
 E3zt5GL6HQ3EvvRqzDA7giYV7mx0BMh+PflOlf4OkhfkFguG5CedUlLztt5HULKD
 9bgbINPiOzp4bAEyyEoqywEVAbCPtUZbSvIjmuNXKvclAbQVWFcwULP5AZNs+KEQ
 DanPQ4TqmtGjD4vwVlz/EKHV3Zgq4S28gVnC2KIKbfS4mbZ/siuPVXMnXIb2caPr
 jKcGPBoEZpZBSGmyGHXAMlpg8mgQOUX8XyuwlMSqGm1JUagjLa38Fw9oz17OF7Ot
 CbEZQjF+WL94OMtaKj22wLr/2uBzrNDI0gW8L3OgCk1fdHHtfEbIn8w5tKi/FZCb
 HIHjtyssZTIvsqr8bgEb6ItShcv2yCNjjqCe86E75pSr44qMt/d0r9OGdIGF3uBa
 CFGbrVhVWUsTYfL8ihsh7r2rgcy89UfJL8+ubHFLQo9JS8gge/0z6iFloDSVxlcO
 3fEcBl6b/f6tn/kJlRHMBzhcrRmnMwavzYIbXixXXi6vOzWIN24=
 =iF4q
 -----END PGP SIGNATURE-----

Merge tag 'riscv-sophgo-dt-for-6.12' of https://github.com/sophgo/linux into soc/dt

RISC-V Devicetrees for v6.12

Sopgho:
Added DMA controller for CV18XX.
Added I2C, MMC, GPIO and onboard MCU (HWMON) for SG2042.
Enable SDHCI0 for HuashanPi (using cv1812h).
Some minor changes about dt-bindings for Sipeed LicheeRV Nano board
(using SG2002, and SG2002 is the new codename of CV181xC).

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>

* tag 'riscv-sophgo-dt-for-6.12' of https://github.com/sophgo/linux:
  dt-bindings: riscv: Add Sipeed LicheeRV Nano board compatibles
  dt-bindings: interrupt-controller: Add SOPHGO SG2002 plic
  riscv: dts: sophgo: Add mcu device for Milk-V Pioneer
  riscv: sophgo: dts: add gpio controllers for SG2042 SoC
  riscv: sophgo: dts: add mmc controllers for SG2042 SoC
  riscv: dts: sophgo: Add i2c device support for sg2042
  riscv: dts: sophgo: Use common "interrupt-parent" for all peripherals for sg2042
  riscv: dts: sophgo: Add sdhci0 configuration for Huashan Pi
  riscv: dts: sophgo: cv18xx: add DMA controller

Link: https://lore.kernel.org/r/MA0P287MB28228F4FC59B057DF57D9A11FE9C2@MA0P287MB2822.INDP287.PROD.OUTLOOK.COM
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-05 10:16:25 +00:00
Sean Christopherson
071f24ad28 KVM: Rename arch hooks related to per-CPU virtualization enabling
Rename the per-CPU hooks used to enable virtualization in hardware to
align with the KVM-wide helpers in kvm_main.c, and to better capture that
the callbacks are invoked on every online CPU.

No functional change intended.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Kai Huang <kai.huang@intel.com>
Message-ID: <20240830043600.127750-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-09-04 11:02:33 -04:00
Mike Rapoport (Microsoft)
46bcce5031 arch, mm: move definition of node_data to generic code
Every architecture that supports NUMA defines node_data in the same way:

	struct pglist_data *node_data[MAX_NUMNODES];

No reason to keep multiple copies of this definition and its forward
declarations, especially when such forward declaration is the only thing
in include/asm/mmzone.h for many architectures.

Add definition and declaration of node_data to generic code and drop
architecture-specific versions.

Link: https://lkml.kernel.org/r/20240807064110.1003856-8-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Acked-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Acked-by: Davidlohr Bueso <dave@stgolabs.net>
Tested-by: Zi Yan <ziy@nvidia.com> # for x86_64 and arm64
Tested-by: Jonathan Cameron <Jonathan.Cameron@huawei.com> [arm64 + CXL via QEMU]
Acked-by: Dan Williams <dan.j.williams@intel.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Rafael J. Wysocki <rafael@kernel.org>
Cc: Rob Herring (Arm) <robh@kernel.org>
Cc: Samuel Holland <samuel.holland@sifive.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-03 21:15:28 -07:00
Alexandre Ghiti
1ff95eb2be
riscv: Fix RISCV_ALTERNATIVE_EARLY
RISCV_ALTERNATIVE_EARLY will issue sbi_ecall() very early in the boot
process, before the first memory mapping is setup so we can't have any
instrumentation happening here.

In addition, when the kernel is relocatable, we must also not issue any
relocation this early since they would have been patched virtually only.

So, instead of disabling instrumentation for the whole kernel/sbi.c file
and compiling it with -fno-pie, simply move __sbi_ecall() and
__sbi_base_ecall() into their own file where this is fixed.

Reported-by: Conor Dooley <conor.dooley@microchip.com>
Closes: https://lore.kernel.org/linux-riscv/20240813-pony-truck-3e7a83e9759e@spud/
Reported-by: syzbot+cfbcb82adf6d7279fd35@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/linux-riscv/00000000000065062c061fcec37b@google.com/
Fixes: 1745cfafeb ("riscv: don't use global static vars to store alternative data")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240829165048.49756-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-03 07:57:55 -07:00
Alexandre Ghiti
5f771088a2
riscv: Do not restrict memory size because of linear mapping on nommu
It makes no sense to restrict physical memory size because of linear
mapping size constraints when there is no linear mapping, so only do
that when mmu is enabled.

Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
Closes: https://lore.kernel.org/linux-riscv/CAMuHMdW0bnJt5GMRtOZGkTiM7GK4UaLJCDMF_Ouq++fnDKi3_A@mail.gmail.com/
Fixes: 3b6564427a ("riscv: Fix linear mapping checks for non-contiguous memory regions")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Geert Uytterhoeven <geert+renesas@glider.be>
Link: https://lore.kernel.org/r/20240827065230.145021-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-03 07:57:27 -07:00
Anton Blanchard
5ba7a75a53
riscv: Fix toolchain vector detection
A recent change to gcc flags rv64iv as no longer valid:

   cc1: sorry, unimplemented: Currently the 'V' implementation
   requires the 'M' extension

and as a result vector support is disabled. Fix this by adding m
to our toolchain vector detection code.

Signed-off-by: Anton Blanchard <antonb@tenstorrent.com>
Fixes: fa8e7cce55 ("riscv: Enable Vector code to be built")
Link: https://lore.kernel.org/r/20240819001131.1738806-1-antonb@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-03 07:57:05 -07:00
Charlie Jenkins
4ffc8a3422
riscv: Add license to vmalloc.h
Add a missing license to vmalloc.h.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240729-riscv_fence_license-v1-2-7d5648069640@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-03 07:18:34 -07:00
Charlie Jenkins
097c72e1f2
riscv: Add license to fence.h
Add a missing license to fence.h.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240729-riscv_fence_license-v1-1-7d5648069640@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-09-03 07:18:33 -07:00
Arnd Bergmann
d846d5f1ba T-HEAD Devicetrees for v6.12
Add SPI controller node to th1520.dtsi and enable spi0 on the BeagleV
 Ahead and LicheePi 4A.
 
 The TH1520 AP_SYS clock driver landed in v6.11 so convert multiple
 peripherals like mmc and uart from fixed clocks to the clock controller.
 
 All of these patches have been successfully tested in the latest
 linux-next releases.
 
 Signed-off-by: Drew Fustini <drew@pdp7.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSy8G7QpEpV9aCf6Lbb7CzD2SixDAUCZsSs8AAKCRDb7CzD2Six
 DAAWAQDtP1ptLdtgKZrNi6owuzDh+aJUBcLCb8XobohMhHn/CgEA+9m6uArj0BxA
 FG5xrk6uh+/ueFRCRbU8Bx+PAnHgOQI=
 =VHfU
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmbW4+MACgkQYKtH/8kJ
 Uiezfg//QTJLzANaoQou548RBX4aqs1mV7s8YTuyPzg9WTaJ2pgRO4gJZIm9pBxe
 SBosoNl2mbEEcpD6+V/W7raCxquliS76HyAaFq+379RfN30bGIe1YX7fDaR9DFuW
 HAGVk+skgjZDZBeM2sjFF6ljK6O1ew8lyqNToTIsjwBLELCL/47NcKT/F0HDsCE9
 CedTPmtMpFszLRDoenkuK5eI1OHJFWQd/1O87bdl4LhbPKBBrK/HHyY64DwUYkgn
 k0lJXLfvE3w1mzF5Q5TCBNy/fbHATD3Akbwnv1YOKbScB1hdzsgdoJibkA9ty2ln
 qeubp/H5lTss/vCiU7SQic3b9as6g5pneLM+v2RuN/BLX422HeV1N2DMxx07S4Ah
 +9677pL7WZ2g8DM6u34ZPlZyUoj0ThlEO27GApIYTU0wQ2Vei2/Ru6Wd+DssQoyN
 qr55/U/6RedD3SFNfnZwmA8UWRgHF7wyPaS0llB2ygSHYFR/Da0c+GbIsMK3xIes
 z0x8TKsaxOjqhBrOVi1PmBKANgnq7lkRQ0R2NTF+jCQPQSO4WLsQ0NB9pbCnvnz4
 XXGG8OzsdtY13uVvF7SbNxL7HjXgmPIeTp4NdZxsIob9guobc0DMcaFYUYGWtTFU
 cHJ9DlSe4wwt0/9kAD8LaBgOHXLB96LYPQlD4u/dTpTyUl8sXPY=
 =GXvM
 -----END PGP SIGNATURE-----

Merge tag 'thead-dt-for-v6.12' of https://github.com/pdp7/linux into soc/dt

T-HEAD Devicetrees for v6.12

Add SPI controller node to th1520.dtsi and enable spi0 on the BeagleV
Ahead and LicheePi 4A.

The TH1520 AP_SYS clock driver landed in v6.11 so convert multiple
peripherals like mmc and uart from fixed clocks to the clock controller.

All of these patches have been successfully tested in the latest
linux-next releases.

Signed-off-by: Drew Fustini <drew@pdp7.com>

* tag 'thead-dt-for-v6.12' of https://github.com/pdp7/linux:
  riscv: dts: thead: change TH1520 SPI node to use clock controller
  riscv: dts: thead: add clock to TH1520 gpio nodes
  riscv: dts: thead: update TH1520 dma and timer nodes to use clock controller
  riscv: dts: thead: change TH1520 mmc nodes to use clock controller
  riscv: dts: thead: change TH1520 uart nodes to use clock controller
  riscv: dts: thead: Add TH1520 AP_SUBSYS clock controller
  riscv: dts: thead: add basic spi node

Link: https://lore.kernel.org/r/ZsWs8QiVruMXjzPc@x1
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-09-03 10:24:35 +00:00
Lasse Collin
ab4ce9831a riscv: boot: add Image.xz support
The Image.* targets existed for other compressors already.  Bootloader
support is needed for decompression.

This is for CONFIG_EFI_ZBOOT=n. With CONFIG_EFI_ZBOOT=y, XZ was already
available.

Comparision with Linux 6.10 RV64GC tinyconfig (in KiB):

    1027 Image
     594 Image.gz
     541 Image.zst
     510 Image.lzma
     474 Image.xz

Link: https://lkml.kernel.org/r/20240721133633.47721-17-lasse.collin@tukaani.org
Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Jules Maselbas <jmaselbas@zdiv.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: Joel Stanley <joel@jms.id.au>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Jubin Zhong <zhongjubin@huawei.com>
Cc: Krzysztof Kozlowski <krzk@kernel.org>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rui Li <me@lirui.org>
Cc: Sam James <sam@gentoo.org>
Cc: Simon Glass <sjg@chromium.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-09-01 20:43:27 -07:00
Inochi Amaoto
585dcb21cc riscv: dts: sophgo: Add mcu device for Milk-V Pioneer
Add mcu device and thermal zones node for Milk-V Pioneer.

Tested-by: Chen Wang <unicorn_wang@outlook.com>
Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/IA1PR20MB4953C675C28B35723E87A36BBB822@IA1PR20MB4953.namprd20.prod.outlook.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
2024-09-02 08:35:13 +08:00
Chen Wang
a508d794f8 riscv: sophgo: dts: add gpio controllers for SG2042 SoC
Add support for the GPIO controller of Sophgo SG2042.

SG2042 uses IP from Synopsys DesignWare APB GPIO and has
three GPIO controllers.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/20240819080851.1954691-1-unicornxw@gmail.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
2024-09-02 08:35:13 +08:00
Chen Wang
014b839f79 riscv: sophgo: dts: add mmc controllers for SG2042 SoC
SG2042 has two MMC controller, one for emmc, another for sd-card.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/03ac9ec9c23bbe4c3b30271e76537bdbe5638665.1722847198.git.unicorn_wang@outlook.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
2024-09-02 08:35:12 +08:00
Inochi Amaoto
c8eb04aecd riscv: dts: sophgo: Add i2c device support for sg2042
The i2c ip of sg2042 is a standard Synopsys i2c ip, which is already
supported by the mainline kernel.

Add i2c device node for sg2042.

Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Tested-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/IA1PR20MB49530E59974AF0FCA4FAB6DBBBB72@IA1PR20MB4953.namprd20.prod.outlook.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
2024-09-02 08:35:12 +08:00
Inochi Amaoto
5d9e6bc82b riscv: dts: sophgo: Use common "interrupt-parent" for all peripherals for sg2042
As all peripherals of sg2042 share the same "interrupt-parent",
there is no need to use peripherals specific "interrupt-parent".
Define "interrupt-parent" in the SoC level.

Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Tested-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/IA1PR20MB49531F6DFD2F116207C1397DBBB72@IA1PR20MB4953.namprd20.prod.outlook.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
2024-09-02 08:35:12 +08:00
Inochi Amaoto
63c33528b7 riscv: dts: sophgo: Add sdhci0 configuration for Huashan Pi
Add configuration for sdhci0 for Huashan Pi to support sd card.

Reviewed-by: Chen Wang <unicorn_wang@outlook.com>
Link: https://lore.kernel.org/r/IA1PR20MB49538AC83C5DB314D10F7186BBA92@IA1PR20MB4953.namprd20.prod.outlook.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
2024-09-02 08:32:11 +08:00
Inochi Amaoto
514951a81a riscv: dts: sophgo: cv18xx: add DMA controller
Add DMA controller dt node for CV18XX/SG200x.

Link: https://lore.kernel.org/r/IA1PR20MB4953BD73E12B8A1CDBD9E1A3BB042@IA1PR20MB4953.namprd20.prod.outlook.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
2024-09-02 08:32:11 +08:00
Samuel Holland
b686ecdeac
riscv: misaligned: Restrict user access to kernel memory
raw_copy_{to,from}_user() do not call access_ok(), so this code allowed
userspace to access any virtual memory address.

Cc: stable@vger.kernel.org
Fixes: 7c83232161 ("riscv: add support for misaligned trap handling in S-mode")
Fixes: 441381506b ("riscv: misaligned: remove CONFIG_RISCV_M_MODE specific code")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240815005714.1163136-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-31 17:43:38 -07:00
Palmer Dabbelt
84cfab9a18
Merge patch series "riscv: mm: Do not restrict mmap address based on hint"
Charlie Jenkins <charlie@rivosinc.com> says:

There have been a couple of reports that using the hint address to
restrict the address returned by mmap hint address has caused issues in
applications. A different solution for restricting addresses returned by
mmap is necessary to avoid breakages.

[Palmer: This also just wasn't doing the right thing in the first place,
as it didn't handle the sv39 cases we were trying to deal with.]

* b4-shazam-merge:
  riscv: mm: Do not restrict mmap address based on hint
  riscv: selftests: Remove mmap hint address checks
  Revert "RISC-V: mm: Document mmap changes"

Link: https://lore.kernel.org/r/20240826-riscv_mmap-v1-0-cd8962afe47f@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-29 06:22:51 -07:00
Charlie Jenkins
2116988d53
riscv: mm: Do not restrict mmap address based on hint
The hint address should not forcefully restrict the addresses returned
by mmap as this causes mmap to report ENOMEM when there is memory still
available.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Fixes: b5b4287acc ("riscv: mm: Use hint address in mmap if available")
Fixes: add2cc6b65 ("RISC-V: mm: Restrict address space for sv39,sv48,sv57")
Closes: https://lore.kernel.org/linux-kernel/ZbxTNjQPFKBatMq+@ghost/T/#mccb1890466bf5a488c9ce7441e57e42271895765
Link: https://lore.kernel.org/r/20240826-riscv_mmap-v1-3-cd8962afe47f@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-29 06:03:29 -07:00
Sunil V L
f8619b66bd irqchip/riscv-intc: Add ACPI support for AIA
The RINTC subtype structure in MADT also has information about other
interrupt controllers. Save this information and provide interfaces to
retrieve them when required by corresponding drivers.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://patch.msgid.link/20240812005929.113499-14-sunilvl@ventanamicro.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-27 15:48:35 +02:00
Sunil V L
e77b8dc02a ACPI: RISC-V: Initialize GSI mapping structures
RISC-V has PLIC and APLIC in MADT as well as namespace devices.
Initialize the list of those structures using MADT and namespace devices
to create mapping between the ACPI handle and the GSI ranges. This will
be used later to add dependencies.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://patch.msgid.link/20240812005929.113499-12-sunilvl@ventanamicro.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-27 15:48:35 +02:00
Sunil V L
01415e78cf ACPI: RISC-V: Implement PCI related functionality
Replace the dummy implementation for PCI related functions with actual
implementation. This needs ECAM and MCFG CONFIG options to be enabled
for RISC-V.

Signed-off-by: Sunil V L <sunilvl@ventanamicro.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://patch.msgid.link/20240812005929.113499-10-sunilvl@ventanamicro.com
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
2024-08-27 15:48:35 +02:00
Chen Wang
3ccedd259c riscv: defconfig: sophgo: enable clks for sg2042
Enable clk generators for sg2042 due to many peripherals rely on
these clocks.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-08-19 18:02:08 +01:00
Atish Patra
5aa09297a3 RISC-V: KVM: Fix to allow hpmcounter31 from the guest
The csr_fun defines a count parameter which defines the total number
CSRs emulated in KVM starting from the base. This value should be
equal to total number of counters possible for trap/emulation (32).

Fixes: a9ac6c3752 ("RISC-V: KVM: Implement trap & emulate for hpmcounters")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240816-kvm_pmu_fixes-v1-2-cdfce386dd93@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-08-19 08:58:21 +05:30
Atish Patra
7d1ffc8b08 RISC-V: KVM: Allow legacy PMU access from guest
Currently, KVM traps & emulates PMU counter access only if SBI PMU
is available as the guest can only configure/read PMU counters via
SBI only. However, if SBI PMU is not enabled in the host, the
guest will fallback to the legacy PMU which will try to access
cycle/instret and result in an illegal instruction trap which
is not desired.

KVM can allow dummy emulation of cycle/instret only for the guest
if SBI PMU is not enabled in the host. The dummy emulation will
still return zero as we don't to expose the host counter values
from a guest using legacy PMU.

Fixes: a9ac6c3752 ("RISC-V: KVM: Implement trap & emulate for hpmcounters")
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240816-kvm_pmu_fixes-v1-1-cdfce386dd93@rivosinc.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-08-19 08:58:19 +05:30
Anup Patel
47d40d9329 RISC-V: KVM: Don't zero-out PMU snapshot area before freeing data
With the latest Linux-6.11-rc3, the below NULL pointer crash is observed
when SBI PMU snapshot is enabled for the guest and the guest is forcefully
powered-off.

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000508
  Oops [#1]
  Modules linked in: kvm
  CPU: 0 UID: 0 PID: 61 Comm: term-poll Not tainted 6.11.0-rc3-00018-g44d7178dd77a #3
  Hardware name: riscv-virtio,qemu (DT)
  epc : __kvm_write_guest_page+0x94/0xa6 [kvm]
   ra : __kvm_write_guest_page+0x54/0xa6 [kvm]
  epc : ffffffff01590e98 ra : ffffffff01590e58 sp : ffff8f80001f39b0
   gp : ffffffff81512a60 tp : ffffaf80024872c0 t0 : ffffaf800247e000
   t1 : 00000000000007e0 t2 : 0000000000000000 s0 : ffff8f80001f39f0
   s1 : 00007fff89ac4000 a0 : ffffffff015dd7e8 a1 : 0000000000000086
   a2 : 0000000000000000 a3 : ffffaf8000000000 a4 : ffffaf80024882c0
   a5 : 0000000000000000 a6 : ffffaf800328d780 a7 : 00000000000001cc
   s2 : ffffaf800197bd00 s3 : 00000000000828c4 s4 : ffffaf800248c000
   s5 : ffffaf800247d000 s6 : 0000000000001000 s7 : 0000000000001000
   s8 : 0000000000000000 s9 : 00007fff861fd500 s10: 0000000000000001
   s11: 0000000000800000 t3 : 00000000000004d3 t4 : 00000000000004d3
   t5 : ffffffff814126e0 t6 : ffffffff81412700
  status: 0000000200000120 badaddr: 0000000000000508 cause: 000000000000000d
  [<ffffffff01590e98>] __kvm_write_guest_page+0x94/0xa6 [kvm]
  [<ffffffff015943a6>] kvm_vcpu_write_guest+0x56/0x90 [kvm]
  [<ffffffff015a175c>] kvm_pmu_clear_snapshot_area+0x42/0x7e [kvm]
  [<ffffffff015a1972>] kvm_riscv_vcpu_pmu_deinit.part.0+0xe0/0x14e [kvm]
  [<ffffffff015a2ad0>] kvm_riscv_vcpu_pmu_deinit+0x1a/0x24 [kvm]
  [<ffffffff0159b344>] kvm_arch_vcpu_destroy+0x28/0x4c [kvm]
  [<ffffffff0158e420>] kvm_destroy_vcpus+0x5a/0xda [kvm]
  [<ffffffff0159930c>] kvm_arch_destroy_vm+0x14/0x28 [kvm]
  [<ffffffff01593260>] kvm_destroy_vm+0x168/0x2a0 [kvm]
  [<ffffffff015933d4>] kvm_put_kvm+0x3c/0x58 [kvm]
  [<ffffffff01593412>] kvm_vm_release+0x22/0x2e [kvm]

Clearly, the kvm_vcpu_write_guest() function is crashing because it is
being called from kvm_pmu_clear_snapshot_area() upon guest tear down.

To address the above issue, simplify the kvm_pmu_clear_snapshot_area() to
not zero-out PMU snapshot area from kvm_pmu_clear_snapshot_area() because
the guest is anyway being tore down.

The kvm_pmu_clear_snapshot_area() is also called when guest changes
PMU snapshot area of a VCPU but even in this case the previous PMU
snaphsot area must not be zeroed-out because the guest might have
reclaimed the pervious PMU snapshot area for some other purpose.

Fixes: c2f41ddbcd ("RISC-V: KVM: Implement SBI PMU Snapshot feature")
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20240815170907.2792229-1-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-08-19 08:58:17 +05:30
Andrew Jones
6b7b282e6b RISC-V: KVM: Fix sbiret init before forwarding to userspace
When forwarding SBI calls to userspace ensure sbiret.error is
initialized to SBI_ERR_NOT_SUPPORTED first, in case userspace
neglects to set it to anything. If userspace neglects it then we
can't be sure it did anything else either, so we just report it
didn't do or try anything. Just init sbiret.value to zero, which is
the preferred value to return when nothing special is specified.

KVM was already initializing both sbiret.error and sbiret.value, but
the values used appear to come from a copy+paste of the __sbi_ecall()
implementation, i.e. a0 and a1, which don't apply prior to the call
being executed, nor at all when forwarding to userspace.

Fixes: dea8ee31a0 ("RISC-V: KVM: Add SBI v0.1 support")
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240807154943.150540-2-ajones@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-08-19 08:32:10 +05:30
Palmer Dabbelt
32d5f7add0
Merge patch series "RISC-V: hwprobe: Misaligned scalar perf fix and rename"
Evan Green <evan@rivosinc.com> says:

The CPUPERF0 hwprobe key was documented and identified in code as
a bitmask value, but its contents were an enum. This produced
incorrect behavior in conjunction with the WHICH_CPUS hwprobe flag.
The first patch in this series fixes the bitmask/enum problem by
creating a new hwprobe key that returns the same data, but is
properly described as a value instead of a bitmask. The second patch
renames the value definitions in preparation for adding vector misaligned
access info. As of this version, the old defines are kept in place to
maintain source compatibility with older userspace programs.

* b4-shazam-merge:
  RISC-V: hwprobe: Add SCALAR to misaligned perf defines
  RISC-V: hwprobe: Add MISALIGNED_PERF key

Link: https://lore.kernel.org/r/20240809214444.3257596-1-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-15 13:12:21 -07:00
Alexandre Ghiti
e01d48c699
riscv: Fix out-of-bounds when accessing Andes per hart vendor extension array
The out-of-bounds access is reported by UBSAN:

[    0.000000] UBSAN: array-index-out-of-bounds in ../arch/riscv/kernel/vendor_extensions.c:41:66
[    0.000000] index -1 is out of range for type 'riscv_isavendorinfo [32]'
[    0.000000] CPU: 0 UID: 0 PID: 0 Comm: swapper Not tainted 6.11.0-rc2ubuntu-defconfig #2
[    0.000000] Hardware name: riscv-virtio,qemu (DT)
[    0.000000] Call Trace:
[    0.000000] [<ffffffff94e078ba>] dump_backtrace+0x32/0x40
[    0.000000] [<ffffffff95c83c1a>] show_stack+0x38/0x44
[    0.000000] [<ffffffff95c94614>] dump_stack_lvl+0x70/0x9c
[    0.000000] [<ffffffff95c94658>] dump_stack+0x18/0x20
[    0.000000] [<ffffffff95c8bbb2>] ubsan_epilogue+0x10/0x46
[    0.000000] [<ffffffff95485a82>] __ubsan_handle_out_of_bounds+0x94/0x9c
[    0.000000] [<ffffffff94e09442>] __riscv_isa_vendor_extension_available+0x90/0x92
[    0.000000] [<ffffffff94e043b6>] riscv_cpufeature_patch_func+0xc4/0x148
[    0.000000] [<ffffffff94e035f8>] _apply_alternatives+0x42/0x50
[    0.000000] [<ffffffff95e04196>] apply_boot_alternatives+0x3c/0x100
[    0.000000] [<ffffffff95e05b52>] setup_arch+0x85a/0x8bc
[    0.000000] [<ffffffff95e00ca0>] start_kernel+0xa4/0xfb6

The dereferencing using cpu should actually not happen, so remove it.

Fixes: 23c996fc2b ("riscv: Extend cpufeature.c to detect vendor extensions")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240814192619.276794-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-15 13:12:16 -07:00
Jinjie Ruan
0e3f3649d4
riscv: Enable generic CPU vulnerabilites support
Currently x86, ARM and ARM64 support generic CPU vulnerabilites, but
RISC-V not, such as:

	# cd /sys/devices/system/cpu/vulnerabilities/
x86:
	# cat spec_store_bypass
		Mitigation: Speculative Store Bypass disabled via prctl and seccomp
	# cat meltdown
		Not affected

ARM64:

	# cat spec_store_bypass
		Mitigation: Speculative Store Bypass disabled via prctl and seccomp
	# cat meltdown
		Mitigation: PTI

RISC-V:

	# cat /sys/devices/system/cpu/vulnerabilities
	# ... No such file or directory

As SiFive RISC-V Core IP offerings are not affected by Meltdown and
Spectre, it can use the default weak function as below:

	# cat spec_store_bypass
		Not affected
	# cat meltdown
		Not affected

Link: https://www.sifive.cn/blog/sifive-statement-on-meltdown-and-spectre

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Link: https://lore.kernel.org/r/20240703022732.2068316-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 17:44:34 -07:00
Ying Sun
c6ebf2c528
riscv/kexec_file: Fix relocation type R_RISCV_ADD16 and R_RISCV_SUB16 unknown
Runs on the kernel with CONFIG_RISCV_ALTERNATIVE enabled:
  kexec -sl vmlinux

Error:
  kexec_image: Unknown rela relocation: 34
  kexec_image: Error loading purgatory ret=-8
and
  kexec_image: Unknown rela relocation: 38
  kexec_image: Error loading purgatory ret=-8

The purgatory code uses the 16-bit addition and subtraction relocation
type, but not handled, resulting in kexec_file_load failure.
So add handle to arch_kexec_apply_relocations_add().

Tested on RISC-V64 Qemu-virt, issue fixed.

Co-developed-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Petr Tesarik <petr@tesarici.cz>
Signed-off-by: Ying Sun <sunying@isrc.iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240711083236.2859632-1-sunying@isrc.iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 17:44:33 -07:00
Evan Green
1f5288874d
RISC-V: hwprobe: Add SCALAR to misaligned perf defines
In preparation for misaligned vector performance hwprobe keys, rename
the hwprobe key values associated with misaligned scalar accesses to
include the term SCALAR. Leave the old defines in place to maintain
source compatibility.

This change is intended to be a functional no-op.

Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-3-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:13:24 -07:00
Evan Green
c42e2f0767
RISC-V: hwprobe: Add MISALIGNED_PERF key
RISCV_HWPROBE_KEY_CPUPERF_0 was mistakenly flagged as a bitmask in
hwprobe_key_is_bitmask(), when in reality it was an enum value. This
causes problems when used in conjunction with RISCV_HWPROBE_WHICH_CPUS,
since SLOW, FAST, and EMULATED have values whose bits overlap with
each other. If the caller asked for the set of CPUs that was SLOW or
EMULATED, the returned set would also include CPUs that were FAST.

Introduce a new hwprobe key, RISCV_HWPROBE_KEY_MISALIGNED_PERF, which
returns the same values in response to a direct query (with no flags),
but is properly handled as an enumerated value. As a result, SLOW,
FAST, and EMULATED are all correctly treated as distinct values under
the new key when queried with the WHICH_CPUS flag.

Leave the old key in place to avoid disturbing applications which may
have already come to rely on the key, with or without its broken
behavior with respect to the WHICH_CPUS flag.

Fixes: e178bf146e ("RISC-V: hwprobe: Introduce which-cpus flag")
Signed-off-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240809214444.3257596-2-evan@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:13:23 -07:00
Haibo Xu
a445699879
RISC-V: ACPI: NUMA: initialize all values of acpi_early_node_map to NUMA_NO_NODE
Currently, only acpi_early_node_map[0] was initialized to NUMA_NO_NODE.
To ensure all the values were properly initialized, switch to initialize
all of them to NUMA_NO_NODE.

Fixes: eabd9db64e ("ACPI: RISCV: Add NUMA support based on SRAT and SLIT")
Reported-by: Andrew Jones <ajones@ventanamicro.com>
Suggested-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/0d362a8ae50558b95685da4c821b2ae9e8cf78be.1722828421.git.haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:41 -07:00
Nam Cao
57d76bc51f
riscv: change XIP's kernel_map.size to be size of the entire kernel
With XIP kernel, kernel_map.size is set to be only the size of data part of
the kernel. This is inconsistent with "normal" kernel, who sets it to be
the size of the entire kernel.

More importantly, XIP kernel fails to boot if CONFIG_DEBUG_VIRTUAL is
enabled, because there are checks on virtual addresses with the assumption
that kernel_map.size is the size of the entire kernel (these checks are in
arch/riscv/mm/physaddr.c).

Change XIP's kernel_map.size to be the size of the entire kernel.

Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org> # v6.1+
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240508191917.2892064-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:33 -07:00
Celeste Liu
6111939463
riscv: entry: always initialize regs->a0 to -ENOSYS
Otherwise when the tracer changes syscall number to -1, the kernel fails
to initialize a0 with -ENOSYS and subsequently fails to return the error
code of the failed syscall to userspace. For example, it will break
strace syscall tampering.

Fixes: 52449c17bd ("riscv: entry: set a0 = -ENOSYS only when syscall != -1")
Reported-by: "Dmitry V. Levin" <ldv@strace.io>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Celeste Liu <CoelacanthusHex@gmail.com>
Link: https://lore.kernel.org/r/20240627142338.5114-2-CoelacanthusHex@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-14 13:12:22 -07:00
Drew Fustini
2d98fea749 riscv: dts: thead: change TH1520 SPI node to use clock controller
Change the clock property in the TH1520 SPI controller node to a clock
provided by AP_SYS clock controller.

Remove spi_clk fixed clock reference from BeagleV Ahead and LPI4a dts.

Link: https://git.beagleboard.org/beaglev-ahead/beaglev-ahead/-/tree/main/docs
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-08-08 09:19:46 -07:00
Drew Fustini
7f5b28218c riscv: dts: thead: add clock to TH1520 gpio nodes
Add clock property to TH1520 gpio controller nodes. These clock gates
refer to corresponding enable bits in the peripheral clock gate control
register. Refer to register PERI_CLK_CFG in section 4.4.2.2.52 of the
TH1520 System User Manual.

Link: https://git.beagleboard.org/beaglev-ahead/beaglev-ahead/-/tree/main/docs
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-08-08 09:19:46 -07:00
Drew Fustini
89d58327fd riscv: dts: thead: update TH1520 dma and timer nodes to use clock controller
Change the dma-controller and timer nodes to use the APB clock provided
by the AP_SUBSYS clock controller.

Remove apb_clk reference from BeagleV Ahead and LPI4a dts.

Link: https://git.beagleboard.org/beaglev-ahead/beaglev-ahead/-/tree/main/docs
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-08-08 09:19:46 -07:00
Drew Fustini
03a20182e1 riscv: dts: thead: change TH1520 mmc nodes to use clock controller
Change the clock property in the TH1520 mmc controller nodes to a clock
provided by AP_SYS clock controller.

Remove sdhci fixed clock reference from BeagleV Ahead and LPI4a dts.

Link: https://git.beagleboard.org/beaglev-ahead/beaglev-ahead/-/tree/main/docs
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-08-08 09:19:45 -07:00
Drew Fustini
c101b4a028 riscv: dts: thead: change TH1520 uart nodes to use clock controller
Change the clock property in TH1520 uart nodes to a clock provided by
AP_SUBSYS clock controller.

Link: https://git.beagleboard.org/beaglev-ahead/beaglev-ahead/-/tree/main/docs
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-08-08 09:19:45 -07:00
Drew Fustini
e919fe036a riscv: dts: thead: Add TH1520 AP_SUBSYS clock controller
Add node for the AP_SUBSYS clock controller on the T-Head TH1520 SoC.

Link: https://openbeagle.org/beaglev-ahead/beaglev-ahead/-/blob/main/docs/TH1520%20System%20User%20Manual.pdf
Link: https://git.beagleboard.org/beaglev-ahead/beaglev-ahead/-/tree/main/docs
Signed-off-by: Drew Fustini <dfustini@tenstorrent.com>
2024-08-08 09:19:45 -07:00
Ryo Takakura
f15c21a3de
RISC-V: Enable IPI CPU Backtrace
Add arch_trigger_cpumask_backtrace() which is a generic infrastructure
for sampling other CPUs' backtrace using IPI.

The feature is used when lockups are detected or in case of oops/panic
if parameters are set accordingly.

Below is the case of oops with the oops_all_cpu_backtrace enabled.

$ sysctl kernel.oops_all_cpu_backtrace=1

triggering oops shows:
[  212.214237] NMI backtrace for cpu 1
[  212.214390] CPU: 1 PID: 610 Comm: in:imklog Tainted: G           OE      6.10.0-rc6 #1
[  212.214570] Hardware name: riscv-virtio,qemu (DT)
[  212.214690] epc : fallback_scalar_usercopy+0x8/0xdc
[  212.214809]  ra : _copy_to_user+0x20/0x40
[  212.214913] epc : ffffffff80c3a930 ra : ffffffff8059ba7e sp : ff20000000eabb50
[  212.215061]  gp : ffffffff82066f90 tp : ff6000008e958000 t0 : 3463303866660000
[  212.215210]  t1 : 000000000000005b t2 : 3463303866666666 s0 : ff20000000eabb60
[  212.215358]  s1 : 0000000000000386 a0 : 00007ff6e81df926 a1 : ff600000824df800
[  212.215505]  a2 : 000000000000003f a3 : 7fffffffffffffc0 a4 : 0000000000000000
[  212.215651]  a5 : 000000000000003f a6 : 0000000000000000 a7 : 0000000000000000
[  212.215857]  s2 : ff600000824df800 s3 : ffffffff82066cc0 s4 : 0000000000001c1a
[  212.216074]  s5 : ffffffff8206a5a8 s6 : 00007ff6e81df926 s7 : ffffffff8206a5a0
[  212.216278]  s8 : ff600000824df800 s9 : ffffffff81e25de0 s10: 000000000000003f
[  212.216471]  s11: ffffffff8206a59d t3 : ff600000824df812 t4 : ff600000824df812
[  212.216651]  t5 : ff600000824df818 t6 : 0000000000040000
[  212.216796] status: 0000000000040120 badaddr: 0000000000000000 cause: 8000000000000001
[  212.217035] [<ffffffff80c3a930>] fallback_scalar_usercopy+0x8/0xdc
[  212.217207] [<ffffffff80095f56>] syslog_print+0x1f4/0x2b2
[  212.217362] [<ffffffff80096e5c>] do_syslog.part.0+0x94/0x2d8
[  212.217502] [<ffffffff800979e8>] do_syslog+0x66/0x88
[  212.217636] [<ffffffff803a5dda>] kmsg_read+0x44/0x5c
[  212.217764] [<ffffffff80392dbe>] proc_reg_read+0x7a/0xa8
[  212.217952] [<ffffffff802ff726>] vfs_read+0xb0/0x24e
[  212.218090] [<ffffffff803001ba>] ksys_read+0x64/0xe4
[  212.218264] [<ffffffff8030025a>] __riscv_sys_read+0x20/0x2c
[  212.218453] [<ffffffff80c4af9a>] do_trap_ecall_u+0x60/0x1d4
[  212.218664] [<ffffffff80c56998>] ret_from_exception+0x0/0x64

Signed-off-by: Ryo Takakura <takakura@valinux.co.jp>
Link: https://lore.kernel.org/r/20240718093659.158912-1-takakura@valinux.co.jp
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-07 07:11:59 -07:00
Alexandre Ghiti
ee9a68394b
riscv: Re-introduce global icache flush in patch_text_XXX()
commit edf2d546bf ("riscv: patch: Flush the icache right after
patching to avoid illegal insns") mistakenly removed the global icache
flush in patch_text_nosync() and patch_text_set_nosync() functions, so
reintroduce them.

Fixes: edf2d546bf ("riscv: patch: Flush the icache right after patching to avoid illegal insns")
Reported-by: Samuel Holland <samuel.holland@sifive.com>
Closes: https://lore.kernel.org/linux-riscv/a28ddc26-d77a-470a-a33f-88144f717e86@sifive.com/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240801191404.55181-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-06 06:49:14 -07:00
Palmer Dabbelt
7c08a2615f
Merge patch series "RISC-V: Parse DT for Zkr to seed KASLR"
Jesse Taube <jesse@rivosinc.com> says:

Add functions to pi/fdt_early.c to help parse the FDT to check if
the isa string has the Zkr extension. Then use the Zkr extension to
seed the KASLR base address.

The first two patches fix the visibility of symbols.

* b4-shazam-merge:
  RISC-V: Use Zkr to seed KASLR base address
  RISC-V: pi: Add kernel/pi/pi.h
  RISC-V: lib: Add pi aliases for string functions
  RISC-V: pi: Force hidden visibility for all symbol references

Link: https://lore.kernel.org/r/20240709173937.510084-1-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:43 -07:00
Jesse Taube
945302df3d
RISC-V: Use Zkr to seed KASLR base address
Parse the device tree for Zkr in the isa string.
If Zkr is present, use it to seed the kernel base address.

On an ACPI system, as of this commit, there is no easy way to check if
Zkr is present. Blindly running the instruction isn't an option as;
we have to be able to trust the firmware.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Zong Li <zong.li@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240709173937.510084-5-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:41 -07:00
Jesse Taube
b331182715
RISC-V: pi: Add kernel/pi/pi.h
Add pi.h header for declarations of the kernel/pi prefixed functions
and any other related declarations.

Suggested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240709173937.510084-4-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:40 -07:00
Jesse Taube
d57e19fcbf
RISC-V: lib: Add pi aliases for string functions
memset, strcmp, and strncmp are all used in the __pi_ section,
add SYM_FUNC_ALIAS for them.

When KASAN is enabled in <asm/string.h> __pi___memset is also needed.

Suggested-by: Charlie Jenkins <charlie@rivosinc.com>
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240709173937.510084-3-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:39 -07:00
Jesse Taube
14c3ec6723
RISC-V: pi: Force hidden visibility for all symbol references
Eliminate all GOT entries in the .pi section, by forcing hidden
visibility for all symbol references, which informs the compiler that
such references will be resolved at link time without the need for
allocating GOT entries.

Include linux/hidden.h in Makefile, like arm64, for the
hidden visibility attribute.

Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240709173937.510084-2-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-05 12:06:38 -07:00
Linus Torvalds
948752d2e0 RISC-V Fixes for 6.11-rc2
* A fix to avoid dropping some of the internal pseudo-extensions, which
   breaks *envcfg dependency parsing.
 * The kernel entry address is now aligned in purgatory, which avoids a
   misaligned load that can lead to crash on systems that don't support
   misaligned accesses early in boot.
 * The FW_SFENCE_VMA_RECEIVED perf event was duplicated in a handful of
   perf JSON configurations, one of them been updated to
   FW_SFENCE_VMA_ASID_SENT.
 * The starfive cache driver is now restricted to 64-bit systems, as it
   isn't 32-bit clean.
 * A fix for to avoid aliasing legacy-mode perf counters with software
   perf counters.
 * VM_FAULT_SIGSEGV is now handled in the page fault code.
 * A fix for stalls during CPU hotplug due to IPIs being disabled.
 * A fix for memblock bounds checking.  This manifests as a crash on
   systems with discontinuous memory maps that have regions that don't
   fit in the linear map.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmas/qwTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWp7EACDcorcihBG8uSsX//GKJPjkiGIbZkT
 MIMN3yqIzJuSftxpvgVxpyq2MFKYy7BK/75sK+4VoQpoCJEtdxbdh0JUqck/Nrgj
 Kn0hxWy7RO6Rp9ggf9dTdca64Tdxh32Eegpum3E46zuhYQBMcNze4z4NsOXs6ems
 254ww8+v7V5R7FGsxm1PG4Hs3soxZ9FPdWE69ndxmjr9N5FFkchk5YbV8AgKYtSJ
 sfu5Q+68zh58GVZhn0usug0fHNgVzdvwy3PIBDGD58hqIDAs9WlF80MiW3sESTIe
 PrJcAFBU4tHp+8h+OMaKw2xfybrZpNmqobx7dED34PJu0R4+Uvz7MUKMMPUJeB+q
 7UOZokjF2Hvd5VsAeTc1PisvzVsWkWpkzJqZmdaTr2m8J4m5z7/nby+ZcXmoOlVz
 JiMDgrkM4KIziq++9bYbBfcxsS9dMsvNtEQAHByL/zdVfAFTvWUMUmAgg27C3K9Z
 QbHfbpxqQ/pEu4CsRUIx4GnkEKnWPLuGovnYboGmC3BCDwQkkV8H0tcEhJtWMKte
 6h+vvKBX2POS4l8467ElmcTRv5Cfpi/dmhZrC9SHHQhNF5OiHHM2CmSEOKS1bUPj
 e4+k/QGmVQOAJGRRPkpD+DFMhHT/jhvbYV4kDXr/h9AKJQ2eWRGMSOMaPJ/X311N
 R5W1yiJilIhXuQ==
 =K52W
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix to avoid dropping some of the internal pseudo-extensions, which
   breaks *envcfg dependency parsing

 - The kernel entry address is now aligned in purgatory, which avoids a
   misaligned load that can lead to crash on systems that don't support
   misaligned accesses early in boot

 - The FW_SFENCE_VMA_RECEIVED perf event was duplicated in a handful of
   perf JSON configurations, one of them been updated to
   FW_SFENCE_VMA_ASID_SENT

 - The starfive cache driver is now restricted to 64-bit systems, as it
   isn't 32-bit clean

 - A fix for to avoid aliasing legacy-mode perf counters with software
   perf counters

 - VM_FAULT_SIGSEGV is now handled in the page fault code

 - A fix for stalls during CPU hotplug due to IPIs being disabled

 - A fix for memblock bounds checking. This manifests as a crash on
   systems with discontinuous memory maps that have regions that don't
   fit in the linear map

* tag 'riscv-for-linus-6.11-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix linear mapping checks for non-contiguous memory regions
  RISC-V: Enable the IPI before workqueue_online_cpu()
  riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()
  perf: riscv: Fix selecting counters in legacy mode
  cache: StarFive: Require a 64-bit system
  perf arch events: Fix duplicate RISC-V SBI firmware event name
  riscv/purgatory: align riscv_kernel_entry
  riscv: cpufeature: Do not drop Linux-internal extensions
2024-08-02 09:33:35 -07:00
Arnd Bergmann
343416f0c1 syscalls: fix syscall macros for newfstat/newfstatat
The __NR_newfstat and __NR_newfstatat macros accidentally got renamed
in the conversion to the syscall.tbl format, dropping the 'new' portion
of the name.

In an unrelated change, the two syscalls are no longer architecture
specific but are once more defined on all 64-bit architectures, so the
'newstat' ABI keyword can be dropped from the table as a simplification.

Fixes: Fixes: 4fe53bf2ba ("syscalls: add generic scripts/syscall.tbl")
Closes: https://lore.kernel.org/lkml/838053e0-b186-4e9f-9668-9a3384a71f23@app.fastmail.com/T/#t
Reported-by: Florian Weimer <fweimer@redhat.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-08-02 15:20:47 +02:00
Stuart Menefy
3b6564427a
riscv: Fix linear mapping checks for non-contiguous memory regions
The RISC-V kernel already has checks to ensure that memory which would
lie outside of the linear mapping is not used. However those checks
use memory_limit, which is used to implement the mem= kernel command
line option (to limit the total amount of memory, not its address
range). When memory is made up of two or more non-contiguous memory
banks this check is incorrect.

Two changes are made here:
 - add a call in setup_bootmem() to memblock_cap_memory_range() which
   will cause any memory which falls outside the linear mapping to be
   removed from the memory regions.
 - remove the check in create_linear_mapping_page_table() which was
   intended to remove memory which is outside the liner mapping based
   on memory_limit, as it is no longer needed. Note a check for
   mapping more memory than memory_limit (to implement mem=) is
   unnecessary because of the existing call to
   memblock_enforce_memory_limit().

This issue was seen when booting on a SV39 platform with two memory
banks:
  0x00,80000000 1GiB
  0x20,00000000 32GiB
This memory range is 158GiB from top to bottom, but the linear mapping
is limited to 128GiB, so the lower block of RAM will be mapped at
PAGE_OFFSET, and the upper block straddles the top of the linear
mapping.

This causes the following Oops:
[    0.000000] Linux version 6.10.0-rc2-gd3b8dd5b51dd-dirty (stuart.menefy@codasip.com) (riscv64-codasip-linux-gcc (GCC) 13.2.0, GNU ld (GNU Binutils) 2.41.0.20231213) #20 SMP Sat Jun 22 11:34:22 BST 2024
[    0.000000] memblock_add: [0x0000000080000000-0x00000000bfffffff] early_init_dt_add_memory_arch+0x4a/0x52
[    0.000000] memblock_add: [0x0000002000000000-0x00000027ffffffff] early_init_dt_add_memory_arch+0x4a/0x52
...
[    0.000000] memblock_alloc_try_nid: 23724 bytes align=0x8 nid=-1 from=0x0000000000000000 max_addr=0x0000000000000000 early_init_dt_alloc_memory_arch+0x1e/0x48
[    0.000000] memblock_reserve: [0x00000027ffff5350-0x00000027ffffaffb] memblock_alloc_range_nid+0xb8/0x132
[    0.000000] Unable to handle kernel paging request at virtual address fffffffe7fff5350
[    0.000000] Oops [#1]
[    0.000000] Modules linked in:
[    0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 6.10.0-rc2-gd3b8dd5b51dd-dirty #20
[    0.000000] Hardware name: codasip,a70x (DT)
[    0.000000] epc : __memset+0x8c/0x104
[    0.000000]  ra : memblock_alloc_try_nid+0x74/0x84
[    0.000000] epc : ffffffff805e88c8 ra : ffffffff806148f6 sp : ffffffff80e03d50
[    0.000000]  gp : ffffffff80ec4158 tp : ffffffff80e0bec0 t0 : fffffffe7fff52f8
[    0.000000]  t1 : 00000027ffffb000 t2 : 5f6b636f6c626d65 s0 : ffffffff80e03d90
[    0.000000]  s1 : 0000000000005cac a0 : fffffffe7fff5350 a1 : 0000000000000000
[    0.000000]  a2 : 0000000000005cac a3 : fffffffe7fffaff8 a4 : 000000000000002c
[    0.000000]  a5 : ffffffff805e88c8 a6 : 0000000000005cac a7 : 0000000000000030
[    0.000000]  s2 : fffffffe7fff5350 s3 : ffffffffffffffff s4 : 0000000000000000
[    0.000000]  s5 : ffffffff8062347e s6 : 0000000000000000 s7 : 0000000000000001
[    0.000000]  s8 : 0000000000002000 s9 : 00000000800226d0 s10: 0000000000000000
[    0.000000]  s11: 0000000000000000 t3 : ffffffff8080a928 t4 : ffffffff8080a928
[    0.000000]  t5 : ffffffff8080a928 t6 : ffffffff8080a940
[    0.000000] status: 0000000200000100 badaddr: fffffffe7fff5350 cause: 000000000000000f
[    0.000000] [<ffffffff805e88c8>] __memset+0x8c/0x104
[    0.000000] [<ffffffff8062349c>] early_init_dt_alloc_memory_arch+0x1e/0x48
[    0.000000] [<ffffffff8043e892>] __unflatten_device_tree+0x52/0x114
[    0.000000] [<ffffffff8062441e>] unflatten_device_tree+0x9e/0xb8
[    0.000000] [<ffffffff806046fe>] setup_arch+0xd4/0x5bc
[    0.000000] [<ffffffff806007aa>] start_kernel+0x76/0x81a
[    0.000000] Code: b823 02b2 bc23 02b2 b023 04b2 b423 04b2 b823 04b2 (bc23) 04b2
[    0.000000] ---[ end trace 0000000000000000 ]---
[    0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[    0.000000] ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

The problem is that memblock (unaware that some physical memory cannot
be used) has allocated memory from the top of memory but which is
outside the linear mapping region.

Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Fixes: c99127c452 ("riscv: Make sure the linear mapping does not use the kernel mapping")
Reviewed-by: David McKay <david.mckay@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240622114217.2158495-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-01 11:46:09 -07:00
Nick Hu
3908ba2e0b
RISC-V: Enable the IPI before workqueue_online_cpu()
Sometimes the hotplug cpu stalls at the arch_cpu_idle() for a while after
workqueue_online_cpu(). When cpu stalls at the idle loop, the reschedule
IPI is pending. However the enable bit is not enabled yet so the cpu stalls
at WFI until watchdog timeout. Therefore enable the IPI before the
workqueue_online_cpu() to fix the issue.

Fixes: 63c5484e74 ("workqueue: Add multiple affinity scopes and interface to select them")
Signed-off-by: Nick Hu <nick.hu@sifive.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240717031714.1946036-1-nick.hu@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-01 07:15:43 -07:00
Zhe Qiao
0c710050c4
riscv/mm: Add handling for VM_FAULT_SIGSEGV in mm_fault_error()
Handle VM_FAULT_SIGSEGV in the page fault path so that we correctly
kill the process and we don't BUG() the kernel.

Fixes: 07037db5d4 ("RISC-V: Paging and MMU")
Signed-off-by: Zhe Qiao <qiaozhe@iscas.ac.cn>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240731084547.85380-1-qiaozhe@iscas.ac.cn
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-01 07:15:27 -07:00
Daniel Maslowski
fb197c5d2f
riscv/purgatory: align riscv_kernel_entry
When alignment handling is delegated to the kernel, everything must be
word-aligned in purgatory, since the trap handler is then set to the
kexec one. Without the alignment, hitting the exception would
ultimately crash. On other occasions, the kernel's handler would take
care of exceptions.
This has been tested on a JH7110 SoC with oreboot and its SBI delegating
unaligned access exceptions and the kernel configured to handle them.

Fixes: 736e30af58 ("RISC-V: Add purgatory")
Signed-off-by: Daniel Maslowski <cyrevolt@gmail.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240719170437.247457-1-cyrevolt@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-08-01 07:14:34 -07:00
Kanak Shilledar
32121e1584 riscv: dts: thead: add basic spi node
created spi0 node with fixed clock. the spi0 node
uses synopsis designware driver and has the following
compatible "snps,dw-apb-ssi". the spi0 node is connected
to a SPI NOR flash pad which is left unpopulated on the back
side of the board.

Acked-by: Drew Fustini <drew@pdp7.com>
Signed-off-by: Kanak Shilledar <kanakshilledar@gmail.com>
Signed-off-by: Drew Fustini <drew@pdp7.com>
2024-07-31 17:27:00 -07:00
Samuel Holland
b75a22e7d4
riscv: cpufeature: Do not drop Linux-internal extensions
The Linux-internal Xlinuxenvcfg ISA extension is omitted from the
riscv_isa_ext array because it has no DT binding and should not appear
in /proc/cpuinfo. The logic added in commit 625034abd5 ("riscv: add
ISA extensions validation callback") assumes all extensions are included
in riscv_isa_ext, and so riscv_resolve_isa() wrongly drops Xlinuxenvcfg
from the final ISA string. Instead, accept such Linux-internal ISA
extensions as if they have no validation callback.

Fixes: 625034abd5 ("riscv: add ISA extensions validation callback")
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240718213011.2600150-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-31 09:53:13 -07:00
Linus Torvalds
c9f33436d8 RISC-V Patches for the 6.11 Merge Window, Part 2
* Support for NUMA (via SRAT and SLIT), console output (via SPCR), and
   cache info (via PPTT) on ACPI-based systems.
 * The trap entry/exit code no longer breaks the return address stack
   predictor on many systems, which results in an improvement to trap
   latency.
 * Support for HAVE_ARCH_STACKLEAK.
 * The sv39 linear map has been extended to support 128GiB mappings.
 * The frequency of the mtime CSR is now visible via hwprobe.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmaj2EYTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiVG3D/9kNHTI09iPDJd6fTChE3cpMxy7xXXE
 URX3Avu+gYsJmIbYyg4RnQ8FGFN7icKBCrQqs7JmLliU0NU+YMcCcjsJA2QaivbD
 VAlaex1qNcvNGteHrpbqhr3Zs4zw8GlBkB3KFTLyPAp61bybGo0a/A5ONJ7ScQIW
 RWHewAPgb86cQ0Q34JpO87TqvMM0KMvhQP5dip+olaFjLRBzhXmGFZfHqA80kTWl
 0ytYclVCHZMtO/5mnQpuIOVs1IKw9L4wa0sivOQF0iLTqfKDFALa6yZsThHA/w3e
 JVuBAdQhcPZ3fgO2fUfJPlW16GmRC2/tdiFg5NFw8k4vo7DYBwX55ztPKXqDrJDM
 8ah85IeLiPar/A/uHdn6bPjK+aGMuzklKF50r62XXAc2fL8mza1sdvKCVOy2EOLn
 JyGI9c/10KpvN/DW8g7hPefhvbx4+tCKkFcPqf++VQha6W8cQdCKi+Li0Pm8TTnp
 XPQjIvSlDDG1Pl4ofgBSFoyB8pkBXNzvv8NZp+YYtnqSOLAKaZuP+KwA8TwHdvGM
 pdCXcL3KHiLy4/pJWEoNTutD0mbJ7PUIb2P/KkjqYDgp4F1n0Hg+/aeSIp+7a4Pv
 yTBctIGxrlriQMIdtWCR8tyhcPP4pDpGYkW0K15EE16G0NK0fjD89LEXYqT6ae2R
 C0QgiwnVe/eopg==
 =zeUn
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - Support for NUMA (via SRAT and SLIT), console output (via SPCR), and
   cache info (via PPTT) on ACPI-based systems.

 - The trap entry/exit code no longer breaks the return address stack
   predictor on many systems, which results in an improvement to trap
   latency.

 - Support for HAVE_ARCH_STACKLEAK.

 - The sv39 linear map has been extended to support 128GiB mappings.

 - The frequency of the mtime CSR is now visible via hwprobe.

* tag 'riscv-for-linus-6.11-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (21 commits)
  RISC-V: Provide the frequency of time CSR via hwprobe
  riscv: Extend sv39 linear mapping max size to 128G
  riscv: enable HAVE_ARCH_STACKLEAK
  riscv: signal: Remove unlikely() from WARN_ON() condition
  riscv: Improve exception and system call latency
  RISC-V: Select ACPI PPTT drivers
  riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
  riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()
  RISC-V: ACPI: Enable SPCR table for console output on RISC-V
  riscv: boot: remove duplicated targets line
  trace: riscv: Remove deprecated kprobe on ftrace support
  riscv: cpufeature: Extract common elements from extension checking
  riscv: Introduce vendor variants of extension helpers
  riscv: Add vendor extensions to /proc/cpuinfo
  riscv: Extend cpufeature.c to detect vendor extensions
  RISC-V: run savedefconfig for defconfig
  RISC-V: hwprobe: sort EXT_KEY()s in hwprobe_isa_ext0() alphabetically
  ACPI: NUMA: replace pr_info with pr_debug in arch_acpi_numa_init
  ACPI: NUMA: change the ACPI_NUMA to a hidden option
  ACPI: NUMA: Add handler for SRAT RINTC affinity structure
  ...
2024-07-27 10:14:34 -07:00
Linus Torvalds
51c4767503 bitmap-6.11-rc1
Random fixes for v6.11.
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEi8GdvG6xMhdgpu/4sUSA/TofvsgFAmahKbIACgkQsUSA/Tof
 vsh8zQwAvguyeNubDFqdMe3E/Vp1J3WqXsBFzbE1rGLCyI2S0cgJFL5BlW51zY47
 70wLt9EmroEobwj1qHSQlzejNp31kSBQ1Sqq25oivfJqEF1elDT5PQxYqBbU1C9Y
 kVWnxtb+oKaoFd5jiBK8+iTl8dXjT6H2RoV0zpPab/JPcqsjwFfkUvtENt/Kpo5c
 aRrGTFwshdp5eT4sEZQv57VKroBcwZOvv2//qrklFHrJHl4pjMT8eaX3twcQysoy
 umTVt+TK6NErLnht+VRQJ2/L02FKi7b+bHePVgNzaT+1FSDMT4FltmZd96Xwbzah
 hSkwWtqy0N2gaTcqie9nwdEiCJGjF39M7k2wangUS91CeDsbIUSsJgDCESUCm+zK
 hRqleGOnoeg4+jZBci7M53lKa/pADlmLhnU8iAc3BSKozsaioltkT+hHn8vAkstk
 h/kHlbfkzasufUWAhduBpIn384gWWEY6RACffgCsOuvbT+ZyDKUJpKYaEwVx+Pri
 l72j0hs9
 =RbET
 -----END PGP SIGNATURE-----

Merge tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux

Pull bitmap updates from Yury Norov:
 "Random fixes"

* tag 'bitmap-6.11-rc1' of https://github.com:/norov/linux:
  riscv: Remove unnecessary int cast in variable_fls()
  radix tree test suite: put definition of bitmap_clear() into lib/bitmap.c
  bitops: Add a comment explaining the double underscore macros
  lib: bitmap: add missing MODULE_DESCRIPTION() macros
  cpumask: introduce assign_cpu() macro
2024-07-26 09:50:36 -07:00
Palmer Dabbelt
52420e483d
RISC-V: Provide the frequency of time CSR via hwprobe
The RISC-V architecture makes a real time counter CSR (via RDTIME
instruction) available for applications in U-mode but there is no
architected mechanism for an application to discover the frequency
the counter is running at. Some applications (e.g., DPDK) use the
time counter for basic performance analysis as well as fine grained
time-keeping.

Add support to the hwprobe system call to export the time CSR
frequency to code running in U-mode.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Punit Agrawal <punit.agrawal@bytedance.com>
Link: https://lore.kernel.org/r/20240702033731.71955-2-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:51 -07:00
Stuart Menefy
5c8405d763
riscv: Extend sv39 linear mapping max size to 128G
This harmonizes all virtual addressing modes which can now all map
(PGDIR_SIZE * PTRS_PER_PGD) / 4 of physical memory.

The RISCV implementation of KASAN requires that the boundary between
shallow mappings are aligned on an 8G boundary. In this case we need
VMALLOC_START to be 8G aligned. So although we only need to move the
start of the linear mapping down by 4GiB to allow 128GiB to be mapped,
we actually move it down by 8GiB (creating a 4GiB hole between the
linear mapping and KASAN shadow space) to maintain the alignment
requirement.

Signed-off-by: Stuart Menefy <stuart.menefy@codasip.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240630110550.1731929-1-stuart.menefy@codasip.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:50 -07:00
Palmer Dabbelt
3aa1a7d013
Merge patch series "RISC-V: Select ACPI PPTT drivers"
This series adds support for ACPI PPTT via cacheinfo.

* b4-shazam-merge:
  RISC-V: Select ACPI PPTT drivers
  riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
  riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()

Link: https://lore.kernel.org/r/20240617131425.7526-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:49 -07:00
Palmer Dabbelt
ec1dc56b54
Merge patch "Enable SPCR table for console output on RISC-V"
Sia Jee Heng <jeeheng.sia@starfivetech.com> says:

The ACPI SPCR code has been used to enable console output for ARM64 and
X86. The same code can be reused for RISC-V. Furthermore, SPCR table is
mandated for headless system as outlined in the RISC-V BRS
Specification, chapter 6.

* b4-shazam-merge:
  RISC-V: ACPI: Enable SPCR table for console output on RISC-V

Link: https://lore.kernel.org/r/20240502073751.102093-1-jeeheng.sia@starfivetech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:48 -07:00
Jisheng Zhang
b5db73fb18
riscv: enable HAVE_ARCH_STACKLEAK
Add support for the stackleak feature. Whenever the kernel returns to user
space the kernel stack is filled with a poison value.

At the same time, disables the plugin in EFI stub code because EFI stub
is out of scope for the protection.

Tested on qemu and milkv duo:
/ # echo STACKLEAK_ERASING > /sys/kernel/debug/provoke-crash/DIRECT
[   38.675575] lkdtm: Performing direct entry STACKLEAK_ERASING
[   38.678448] lkdtm: stackleak stack usage:
[   38.678448]   high offset: 288 bytes
[   38.678448]   current:     496 bytes
[   38.678448]   lowest:      1328 bytes
[   38.678448]   tracked:     1328 bytes
[   38.678448]   untracked:   448 bytes
[   38.678448]   poisoned:    14312 bytes
[   38.678448]   low offset:  8 bytes
[   38.689887] lkdtm: OK: the rest of the thread stack is properly erased

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240623235316.2010-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:47 -07:00
Zhongqiu Han
1d20e5d437
riscv: signal: Remove unlikely() from WARN_ON() condition
"WARN_ON(unlikely(x))" is excessive. WARN_ON() already uses unlikely()
internally.

Signed-off-by: Zhongqiu Han <quic_zhonhan@quicinc.com>
Reviewed-by: Bjorn Andersson <quic_bjorande@quicinc.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240620033434.3778156-1-quic_zhonhan@quicinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:46 -07:00
Anton Blanchard
5d5fc33ce5
riscv: Improve exception and system call latency
Many CPUs implement return address branch prediction as a stack. The
RISCV architecture refers to this as a return address stack (RAS). If
this gets corrupted then the CPU will mispredict at least one but
potentally many function returns.

There are two issues with the current RISCV exception code:

- We are using the alternate link stack (x5/t0) for the indirect branch
  which makes the hardware think this is a function return. This will
  corrupt the RAS.

- We modify the return address of handle_exception to point to
  ret_from_exception. This will also corrupt the RAS.

Testing the null system call latency before and after the patch:

Visionfive2 (StarFive JH7110 / U74)
baseline: 189.87 ns
patched:  176.76 ns

Lichee pi 4a (T-Head TH1520 / C910)
baseline: 666.58 ns
patched:  636.90 ns

Just over 7% on the U74 and just over 4% on the C910.

Signed-off-by: Anton Blanchard <antonb@tenstorrent.com>
Signed-off-by: Cyril Bur <cyrilbur@tenstorrent.com>
Tested-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Jisheng Zhang <jszhang@kernel.org>
Link: https://lore.kernel.org/r/20240607061335.2197383-1-cyrilbur@tenstorrent.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-26 05:50:45 -07:00
Yunhui Cui
66381d3677
RISC-V: Select ACPI PPTT drivers
After adding ACPI support to populate_cache_leaves(), RISC-V can build
cacheinfo through the ACPI PPTT table, thus enabling the ACPI_PPTT
configuration.

Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Link: https://lore.kernel.org/r/20240617131425.7526-3-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-24 07:39:37 -07:00
Yunhui Cui
604f32ea69
riscv: cacheinfo: initialize cacheinfo's level and type from ACPI PPTT
Before cacheinfo can be built correctly, we need to initialize level
and type. Since RISC-V currently does not have a register group that
describes cache-related attributes like ARM64, we cannot obtain them
directly, so now we obtain cache leaves from the ACPI PPTT table
(acpi_get_cache_info()) and set the cache type through split_levels.

Suggested-by: Jeremy Linton <jeremy.linton@arm.com>
Suggested-by: Sudeep Holla <sudeep.holla@arm.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240617131425.7526-2-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-24 07:39:36 -07:00
Yunhui Cui
ee3fab10cb
riscv: cacheinfo: remove the useless input parameter (node) of ci_leaf_init()
ci_leaf_init() is a declared static function. The implementation of the
function body and the caller do not use the parameter (struct device_node
*node) input parameter, so remove it.

Fixes: 6a24915145 ("Revert "riscv: Set more data to cacheinfo"")
Signed-off-by: Yunhui Cui <cuiyunhui@bytedance.com>
Reviewed-by: Jeremy Linton <jeremy.linton@arm.com>
Reviewed-by: Sudeep Holla <sudeep.holla@arm.com>
Link: https://lore.kernel.org/r/20240617131425.7526-1-cuiyunhui@bytedance.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-24 07:39:35 -07:00
Sia Jee Heng
38738947db
RISC-V: ACPI: Enable SPCR table for console output on RISC-V
The ACPI SPCR code has been used to enable console output for ARM64 and
X86. The same code can be reused for RISC-V. Furthermore, SPCR table is
mandated for headless system as outlined in the RISC-V BRS
Specification, chapter 6.

Signed-off-by: Sia Jee Heng <jeeheng.sia@starfivetech.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Link: https://lore.kernel.org/r/20240502073751.102093-2-jeeheng.sia@starfivetech.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-24 07:33:37 -07:00
Jisheng Zhang
8d22d0db5b
riscv: boot: remove duplicated targets line
The "targets:" is duplicated in another line, remove the one with less
targets.

Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Link: https://lore.kernel.org/r/20240613153053.3835-1-jszhang@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-24 06:14:06 -07:00
Jinjie Ruan
3308172276
trace: riscv: Remove deprecated kprobe on ftrace support
Since commit 7caa976546 ("ftrace: riscv: move from REGS to ARGS"),
kprobe on ftrace is not supported by riscv, because riscv's support for
FTRACE_WITH_REGS has been replaced with support for FTRACE_WITH_ARGS, and
KPROBES_ON_FTRACE will be supplanted by FPROBES. So remove the deprecated
kprobe on ftrace support, which is misunderstood.

Signed-off-by: Jinjie Ruan <ruanjinjie@huawei.com>
Acked-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Link: https://lore.kernel.org/r/20240613111347.1745379-1-ruanjinjie@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-24 06:14:05 -07:00
Linus Torvalds
ca83c61cb3 Kbuild updates for v6.11
- Remove tristate choice support from Kconfig
 
  - Stop using the PROVIDE() directive in the linker script
 
  - Reduce the number of links for the combination of CONFIG_DEBUG_INFO_BTF
    and CONFIG_KALLSYMS
 
  - Enable the warning for symbol reference to .exit.* sections by default
 
  - Fix warnings in RPM package builds
 
  - Improve scripts/make_fit.py to generate a FIT image with separate base
    DTB and overlays
 
  - Improve choice value calculation in Kconfig
 
  - Fix conditional prompt behavior in choice in Kconfig
 
  - Remove support for the uncommon EMAIL environment variable in Debian
    package builds
 
  - Remove support for the uncommon "name <email>" form for the DEBEMAIL
    environment variable
 
  - Raise the minimum supported GNU Make version to 4.0
 
  - Remove stale code for the absolute kallsyms
 
  - Move header files commonly used for host programs to scripts/include/
 
  - Introduce the pacman-pkg target to generate a pacman package used in
    Arch Linux
 
  - Clean up Kconfig
 -----BEGIN PGP SIGNATURE-----
 
 iQJJBAABCgAzFiEEbmPs18K1szRHjPqEPYsBB53g2wYFAmagBLUVHG1hc2FoaXJv
 eUBrZXJuZWwub3JnAAoJED2LAQed4NsGmoUQAJ8pnURs0g+Rcyk6bdY/qtXBYkS+
 nXpIK1ssFgRRgAQdeszYtvBqLFzb0wRCSie87G1AriD/JkVVTjCCY1For1y+vs0u
 a7HfxitHhZpPyZW/T+WMQ3LViNccpkx+DFAcoRH8xOY/XPEJKVUby332jOIXMuyg
 +NKIELQJVsLhcDofTUGb5VfIQektw219n5c4jKjXdNk4ZtE24xCRM5X528ZebwWJ
 RZhMvJ968PyIH1IRXvNt6dsKBxoGIwPP8IO6yW9hzHaNsBqt7MGSChSel7r1VKpk
 iwCNApJvEiVBe5wvTSVOVro7/8p/AZ70CQAqnMJV+dNnRqtGqW7NvL6XAjZRJgJJ
 Uxe5NSrXgQd3FtqfcbXLetBgp9zGVt328nHm1HXHR5rFsvoOiTvO7hHPbhA+OoWJ
 fs+jHzEXdAMRgsNrczPWU5Svq6MgGe4v8HBf0m8N1Uy65t/O+z9ti2QAw7kIFlbu
 /VSFNjw4CHmNxGhnH0khCMsy85FwVIt9Ux+2d6IEc0gP8S1Qa1HgHGAoVI4U51eS
 9dxEPVJNPOugaIVHheuS3wimEO6wzaJcQHn4IXaasMA7P6Yo4G/jiGoy4cb9qPTM
 Hb+GaOltUy7vDoG4D2LSym8zR8rdKwbIf/5psdZrq/IWVKq5p+p7KWs3aOykSoM7
 o6Hb532Ioalhm8je
 =BYu7
 -----END PGP SIGNATURE-----

Merge tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild

Pull Kbuild updates from Masahiro Yamada:

 - Remove tristate choice support from Kconfig

 - Stop using the PROVIDE() directive in the linker script

 - Reduce the number of links for the combination of CONFIG_KALLSYMS and
   CONFIG_DEBUG_INFO_BTF

 - Enable the warning for symbol reference to .exit.* sections by
   default

 - Fix warnings in RPM package builds

 - Improve scripts/make_fit.py to generate a FIT image with separate
   base DTB and overlays

 - Improve choice value calculation in Kconfig

 - Fix conditional prompt behavior in choice in Kconfig

 - Remove support for the uncommon EMAIL environment variable in Debian
   package builds

 - Remove support for the uncommon "name <email>" form for the DEBEMAIL
   environment variable

 - Raise the minimum supported GNU Make version to 4.0

 - Remove stale code for the absolute kallsyms

 - Move header files commonly used for host programs to scripts/include/

 - Introduce the pacman-pkg target to generate a pacman package used in
   Arch Linux

 - Clean up Kconfig

* tag 'kbuild-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/masahiroy/linux-kbuild: (65 commits)
  kbuild: doc: gcc to CC change
  kallsyms: change sym_entry::percpu_absolute to bool type
  kallsyms: unify seq and start_pos fields of struct sym_entry
  kallsyms: add more original symbol type/name in comment lines
  kallsyms: use \t instead of a tab in printf()
  kallsyms: avoid repeated calculation of array size for markers
  kbuild: add script and target to generate pacman package
  modpost: use generic macros for hash table implementation
  kbuild: move some helper headers from scripts/kconfig/ to scripts/include/
  Makefile: add comment to discourage tools/* addition for kernel builds
  kbuild: clean up scripts/remove-stale-files
  kconfig: recursive checks drop file/lineno
  kbuild: rpm-pkg: introduce a simple changelog section for kernel.spec
  kallsyms: get rid of code for absolute kallsyms
  kbuild: Create INSTALL_PATH directory if it does not exist
  kbuild: Abort make on install failures
  kconfig: remove 'e1' and 'e2' macros from expression deduplication
  kconfig: remove SYMBOL_CHOICEVAL flag
  kconfig: add const qualifiers to several function arguments
  kconfig: call expr_eliminate_yn() at least once in expr_eliminate_dups()
  ...
2024-07-23 14:32:21 -07:00
Palmer Dabbelt
b9a603da42
Merge patch series "riscv: Separate vendor extensions from standard extensions"
Charlie Jenkins <charlie@rivosinc.com> says:

All extensions, both standard and vendor, live in one struct
"riscv_isa_ext". There is currently one vendor extension, xandespmu, but
it is likely that more vendor extensions will be added to the kernel in
the future. As more vendor extensions (and standard extensions) are
added, riscv_isa_ext will become more bloated with a mix of vendor and
standard extensions.

This also allows each vendor to be conditionally enabled through
Kconfig.

* b4-shazam-merge:
  riscv: cpufeature: Extract common elements from extension checking
  riscv: Introduce vendor variants of extension helpers
  riscv: Add vendor extensions to /proc/cpuinfo
  riscv: Extend cpufeature.c to detect vendor extensions

Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-0-0af7587bbec0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 15:37:01 -07:00
Charlie Jenkins
d4c8d79f51
riscv: cpufeature: Extract common elements from extension checking
The __riscv_has_extension_likely() and __riscv_has_extension_unlikely()
functions from the vendor_extensions.h can be used to simplify the
standard extension checking code as well. Migrate those functions to
cpufeature.h and reorganize the code in the file to use the functions.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-4-0af7587bbec0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 15:36:57 -07:00
Charlie Jenkins
0f24254111
riscv: Introduce vendor variants of extension helpers
Vendor extensions are maintained in per-vendor structs (separate from
standard extensions which live in riscv_isa). Create vendor variants for
the existing extension helpers to interface with the riscv_isa_vendor
bitmaps.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-3-0af7587bbec0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 15:36:56 -07:00
Charlie Jenkins
9448d9accd
riscv: Add vendor extensions to /proc/cpuinfo
All of the supported vendor extensions that have been listed in
riscv_isa_vendor_ext_list can be exported through /proc/cpuinfo.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-2-0af7587bbec0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 15:36:55 -07:00
Charlie Jenkins
23c996fc2b
riscv: Extend cpufeature.c to detect vendor extensions
Instead of grouping all vendor extensions into the same riscv_isa_ext
that standard instructions use, create a struct
"riscv_isa_vendor_ext_data_list" that allows each vendor to maintain
their vendor extensions independently of the standard extensions.
xandespmu is currently the only vendor extension so that is the only
extension that is affected by this change.

An additional benefit of this is that the extensions of each vendor can
be conditionally enabled. A config RISCV_ISA_VENDOR_EXT_ANDES has been
added to allow for that.

Signed-off-by: Charlie Jenkins <charlie@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Tested-by: Yu Chien Peter Lin <peterlin@andestech.com>
Reviewed-by: Yu Chien Peter Lin <peterlin@andestech.com>
Link: https://lore.kernel.org/r/20240719-support_vendor_extensions-v3-1-0af7587bbec0@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 15:36:54 -07:00
Conor Dooley
82b4616806
RISC-V: run savedefconfig for defconfig
It's been a while since this was run, and there's a few things that have
changed. Firstly, almost all of the Renesas stuff vanishes because the
config for the RZ/Five is gated behind NONPORTABLE. Several options
(like CONFIG_PM) are removed as they are the default values.

To retain DEFVFREQ_THERMAL and BLK_DEV_THROTTLING, add PM_DEVFREQ and
BLK_CGROUP respectively.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240717-shrubs-concise-51600886babf@spud
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 10:57:47 -07:00
Conor Dooley
3d8d459c8b
RISC-V: hwprobe: sort EXT_KEY()s in hwprobe_isa_ext0() alphabetically
Currently the entries appear to be in a random order (although according
to Palmer he has tried to sort them by key value) which makes it harder
to find entries in a growing list, and more likely to have conflicts as
all patches are adding to the end of the list. Sort them alphabetically
instead.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240717-dedicate-squeamish-7e4ab54df58f@spud
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 10:57:41 -07:00
Palmer Dabbelt
6a4aa4c94b
Merge patch series "Add ACPI NUMA support for RISC-V"
Haibo Xu <haibo1.xu@intel.com> says:

This patch series enable RISC-V ACPI NUMA support which was based on
the recently approved ACPI ECR[1].

Patch 1/4 add RISC-V specific acpi_numa.c file to parse NUMA information
from SRAT and SLIT ACPI tables.
Patch 2/4 add the common SRAT RINTC affinity structure handler.
Patch 3/4 change the ACPI_NUMA to a hidden option since it would be selected
by default on all supported platform.
Patch 4/4 replace pr_info with pr_debug in arch_acpi_numa_init() to avoid
potential boot noise on ACPI platforms that are not NUMA.

Based-on: https://github.com/linux-riscv/linux-riscv/tree/for-next

[1] https://drive.google.com/file/d/1YTdDx2IPm5IeZjAW932EYU-tUtgS08tX/view?usp=sharing

Testing:
Since the ACPI AIA/PLIC support patch set is still under upstream review,
hence it is tested using the poll based HVC SBI console and RAM disk.
1) Build latest Qemu with the following patch backported
   42bd4eeefd

2) Build latest EDK-II
   https://github.com/tianocore/edk2/blob/master/OvmfPkg/RiscVVirt/README.md

3) Build Linux with the following configs enabled
   CONFIG_RISCV_SBI_V01=y
   CONFIG_SERIAL_EARLYCON_RISCV_SBI=y
   CONFIG_NONPORTABLE=y
   CONFIG_HVC_RISCV_SBI=y
   CONFIG_NUMA=y
   CONFIG_ACPI_NUMA=y

4) Build buildroot rootfs.cpio

5) Launch the Qemu machine
   qemu-system-riscv64 -nographic \
   -machine virt,pflash0=pflash0,pflash1=pflash1 -smp 4 -m 8G \
   -blockdev node-name=pflash0,driver=file,read-only=on,filename=RISCV_VIRT_CODE.fd \
   -blockdev node-name=pflash1,driver=file,filename=RISCV_VIRT_VARS.fd \
   -object memory-backend-ram,size=4G,id=m0 \
   -object memory-backend-ram,size=4G,id=m1 \
   -numa node,memdev=m0,cpus=0-1,nodeid=0 \
   -numa node,memdev=m1,cpus=2-3,nodeid=1 \
   -numa dist,src=0,dst=1,val=30 \
   -kernel linux/arch/riscv/boot/Image \
   -initrd buildroot/output/images/rootfs.cpio \
   -append "root=/dev/ram ro console=hvc0 earlycon=sbi"

[    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x80000000-0x17fffffff]
[    0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x180000000-0x27fffffff]
[    0.000000] NUMA: NODE_DATA [mem 0x17fe3bc40-0x17fe3cfff]
[    0.000000] NUMA: NODE_DATA [mem 0x27fff4c40-0x27fff5fff]
...
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> HARTID 0x0 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 0 -> HARTID 0x1 -> Node 0
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> HARTID 0x2 -> Node 1
[    0.000000] ACPI: NUMA: SRAT: PXM 1 -> HARTID 0x3 -> Node 1

* b4-shazam-merge:
  ACPI: NUMA: replace pr_info with pr_debug in arch_acpi_numa_init
  ACPI: NUMA: change the ACPI_NUMA to a hidden option
  ACPI: NUMA: Add handler for SRAT RINTC affinity structure
  ACPI: RISCV: Add NUMA support based on SRAT and SLIT

Link: https://lore.kernel.org/r/cover.1718268003.git.haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 10:31:51 -07:00
Haibo Xu
eabd9db64e
ACPI: RISCV: Add NUMA support based on SRAT and SLIT
Add acpi_numa.c file to enable parse NUMA information from
ACPI SRAT and SLIT tables. SRAT table provide CPUs(Hart) and
memory nodes to proximity domain mapping, while SLIT table
provide the distance metrics between proximity domains.

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Sunil V L <sunilvl@ventanamicro.com>
Reviewed-by: Hanjun Guo <guohanjun@huawei.com>
Link: https://lore.kernel.org/r/65dbad1fda08a32922c44886e4581e49b4a2fecc.1718268003.git.haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-22 07:13:06 -07:00
Linus Torvalds
fbc90c042c - 875fa64577da ("mm/hugetlb_vmemmap: fix race with speculative PFN
walkers") is known to cause a performance regression
   (https://lore.kernel.org/all/3acefad9-96e5-4681-8014-827d6be71c7a@linux.ibm.com/T/#mfa809800a7862fb5bdf834c6f71a3a5113eb83ff).
   Yu has a fix which I'll send along later via the hotfixes branch.
 
 - In the series "mm: Avoid possible overflows in dirty throttling" Jan
   Kara addresses a couple of issues in the writeback throttling code.
   These fixes are also targetted at -stable kernels.
 
 - Ryusuke Konishi's series "nilfs2: fix potential issues related to
   reserved inodes" does that.  This should actually be in the
   mm-nonmm-stable tree, along with the many other nilfs2 patches.  My bad.
 
 - More folio conversions from Kefeng Wang in the series "mm: convert to
   folio_alloc_mpol()"
 
 - Kemeng Shi has sent some cleanups to the writeback code in the series
   "Add helper functions to remove repeated code and improve readability of
   cgroup writeback"
 
 - Kairui Song has made the swap code a little smaller and a little
   faster in the series "mm/swap: clean up and optimize swap cache index".
 
 - In the series "mm/memory: cleanly support zeropage in
   vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
   Hildenbrand has reworked the rather sketchy handling of the use of the
   zeropage in MAP_SHARED mappings.  I don't see any runtime effects here -
   more a cleanup/understandability/maintainablity thing.
 
 - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling of
   higher addresses, for aarch64.  The (poorly named) series is
   "Restructure va_high_addr_switch".
 
 - The core TLB handling code gets some cleanups and possible slight
   optimizations in Bang Li's series "Add update_mmu_tlb_range() to
   simplify code".
 
 - Jane Chu has improved the handling of our
   fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in the
   series "Enhance soft hwpoison handling and injection".
 
 - Jeff Johnson has sent a billion patches everywhere to add
   MODULE_DESCRIPTION() to everything.  Some landed in this pull.
 
 - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang has
   simplified migration's use of hardware-offload memory copying.
 
 - Yosry Ahmed performs more folio API conversions in his series "mm:
   zswap: trivial folio conversions".
 
 - In the series "large folios swap-in: handle refault cases first",
   Chuanhua Han inches us forward in the handling of large pages in the
   swap code.  This is a cleanup and optimization, working toward the end
   objective of full support of large folio swapin/out.
 
 - In the series "mm,swap: cleanup VMA based swap readahead window
   calculation", Huang Ying has contributed some cleanups and a possible
   fixlet to his VMA based swap readahead code.
 
 - In the series "add mTHP support for anonymous shmem" Baolin Wang has
   taught anonymous shmem mappings to use multisize THP.  By default this
   is a no-op - users must opt in vis sysfs controls.  Dramatic
   improvements in pagefault latency are realized.
 
 - David Hildenbrand has some cleanups to our remaining use of
   page_mapcount() in the series "fs/proc: move page_mapcount() to
   fs/proc/internal.h".
 
 - David also has some highmem accounting cleanups in the series
   "mm/highmem: don't track highmem pages manually".
 
 - Build-time fixes and cleanups from John Hubbard in the series
   "cleanups, fixes, and progress towards avoiding "make headers"".
 
 - Cleanups and consolidation of the core pagemap handling from Barry
   Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
   and utilize them".
 
 - Lance Yang's series "Reclaim lazyfree THP without splitting" has
   reduced the latency of the reclaim of pmd-mapped THPs under fairly
   common circumstances.  A 10x speedup is seen in a microbenchmark.
 
   It does this by punting to aother CPU but I guess that's a win unless
   all CPUs are pegged.
 
 - hugetlb_cgroup cleanups from Xiu Jianfeng in the series
   "mm/hugetlb_cgroup: rework on cftypes".
 
 - Miaohe Lin's series "Some cleanups for memory-failure" does just that
   thing.
 
 - Is anyone reading this stuff?  If so, email me!
 
 - Someone other than SeongJae has developed a DAMON feature in Honggyu
   Kim's series "DAMON based tiered memory management for CXL memory".
   This adds DAMON features which may be used to help determine the
   efficiency of our placement of CXL/PCIe attached DRAM.
 
 - DAMON user API centralization and simplificatio work in SeongJae
   Park's series "mm/damon: introduce DAMON parameters online commit
   function".
 
 - In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
   David Hildenbrand does some maintenance work on zsmalloc - partially
   modernizing its use of pageframe fields.
 
 - Kefeng Wang provides more folio conversions in the series "mm: remove
   page_maybe_dma_pinned() and page_mkclean()".
 
 - More cleanup from David Hildenbrand, this time in the series
   "mm/memory_hotplug: use PageOffline() instead of PageReserved() for
   !ZONE_DEVICE".  It "enlightens memory hotplug more about PageOffline()
   pages" and permits the removal of some virtio-mem hacks.
 
 - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
   __folio_add_anon_rmap()" is a cleanup to the anon folio handling in
   preparation for mTHP (multisize THP) swapin.
 
 - Kefeng Wang's series "mm: improve clear and copy user folio"
   implements more folio conversions, this time in the area of large folio
   userspace copying.
 
 - The series "Docs/mm/damon/maintaier-profile: document a mailing tool
   and community meetup series" tells people how to get better involved
   with other DAMON developers.  From SeongJae Park.
 
 - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
   that.
 
 - David Hildenbrand sends along more cleanups, this time against the
   migration code.  The series is "mm/migrate: move NUMA hinting fault
   folio isolation + checks under PTL".
 
 - Jan Kara has found quite a lot of strangenesses and minor errors in
   the readahead code.  He addresses this in the series "mm: Fix various
   readahead quirks".
 
 - SeongJae Park's series "selftests/damon: test DAMOS tried regions and
   {min,max}_nr_regions" adds features and addresses errors in DAMON's self
   testing code.
 
 - Gavin Shan has found a userspace-triggerable WARN in the pagecache
   code.  The series "mm/filemap: Limit page cache size to that supported
   by xarray" addresses this.  The series is marked cc:stable.
 
 - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
   and cleanup" cleans up and slightly optimizes KSM.
 
 - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
   code motion.  The series (which also makes the memcg-v1 code
   Kconfigurable) are
 
   "mm: memcg: separate legacy cgroup v1 code and put under config
   option" and
   "mm: memcg: put cgroup v1-specific memcg data under CONFIG_MEMCG_V1"
 
 - Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
   adds an additional feature to this cgroup-v2 control file.
 
 - The series "Userspace controls soft-offline pages" from Jiaqi Yan
   permits userspace to stop the kernel's automatic treatment of excessive
   correctable memory errors.  In order to permit userspace to monitor and
   handle this situation.
 
 - Kefeng Wang's series "mm: migrate: support poison recover from migrate
   folio" teaches the kernel to appropriately handle migration from
   poisoned source folios rather than simply panicing.
 
 - SeongJae Park's series "Docs/damon: minor fixups and improvements"
   does those things.
 
 - In the series "mm/zsmalloc: change back to per-size_class lock"
   Chengming Zhou improves zsmalloc's scalability and memory utilization.
 
 - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
   pinning memfd folios" makes the GUP code use FOLL_PIN rather than bare
   refcount increments.  So these paes can first be moved aside if they
   reside in the movable zone or a CMA block.
 
 - Andrii Nakryiko has added a binary ioctl()-based API to /proc/pid/maps
   for much faster reading of vma information.  The series is "query VMAs
   from /proc/<pid>/maps".
 
 - In the series "mm: introduce per-order mTHP split counters" Lance Yang
   improves the kernel's presentation of developer information related to
   multisize THP splitting.
 
 - Michael Ellerman has developed the series "Reimplement huge pages
   without hugepd on powerpc (8xx, e500, book3s/64)".  This permits
   userspace to use all available huge page sizes.
 
 - In the series "revert unconditional slab and page allocator fault
   injection calls" Vlastimil Babka removes a performance-affecting and not
   very useful feature from slab fault injection.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZp2C+QAKCRDdBJ7gKXxA
 joTkAQDvjqOoFStqk4GU3OXMYB7WCU/ZQMFG0iuu1EEwTVDZ4QEA8CnG7seek1R3
 xEoo+vw0sWWeLV3qzsxnCA1BJ8cTJA8=
 =z0Lf
 -----END PGP SIGNATURE-----

Merge tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull MM updates from Andrew Morton:

 - In the series "mm: Avoid possible overflows in dirty throttling" Jan
   Kara addresses a couple of issues in the writeback throttling code.
   These fixes are also targetted at -stable kernels.

 - Ryusuke Konishi's series "nilfs2: fix potential issues related to
   reserved inodes" does that. This should actually be in the
   mm-nonmm-stable tree, along with the many other nilfs2 patches. My
   bad.

 - More folio conversions from Kefeng Wang in the series "mm: convert to
   folio_alloc_mpol()"

 - Kemeng Shi has sent some cleanups to the writeback code in the series
   "Add helper functions to remove repeated code and improve readability
   of cgroup writeback"

 - Kairui Song has made the swap code a little smaller and a little
   faster in the series "mm/swap: clean up and optimize swap cache
   index".

 - In the series "mm/memory: cleanly support zeropage in
   vm_insert_page*(), vm_map_pages*() and vmf_insert_mixed()" David
   Hildenbrand has reworked the rather sketchy handling of the use of
   the zeropage in MAP_SHARED mappings. I don't see any runtime effects
   here - more a cleanup/understandability/maintainablity thing.

 - Dev Jain has improved selftests/mm/va_high_addr_switch.c's handling
   of higher addresses, for aarch64. The (poorly named) series is
   "Restructure va_high_addr_switch".

 - The core TLB handling code gets some cleanups and possible slight
   optimizations in Bang Li's series "Add update_mmu_tlb_range() to
   simplify code".

 - Jane Chu has improved the handling of our
   fake-an-unrecoverable-memory-error testing feature MADV_HWPOISON in
   the series "Enhance soft hwpoison handling and injection".

 - Jeff Johnson has sent a billion patches everywhere to add
   MODULE_DESCRIPTION() to everything. Some landed in this pull.

 - In the series "mm: cleanup MIGRATE_SYNC_NO_COPY mode", Kefeng Wang
   has simplified migration's use of hardware-offload memory copying.

 - Yosry Ahmed performs more folio API conversions in his series "mm:
   zswap: trivial folio conversions".

 - In the series "large folios swap-in: handle refault cases first",
   Chuanhua Han inches us forward in the handling of large pages in the
   swap code. This is a cleanup and optimization, working toward the end
   objective of full support of large folio swapin/out.

 - In the series "mm,swap: cleanup VMA based swap readahead window
   calculation", Huang Ying has contributed some cleanups and a possible
   fixlet to his VMA based swap readahead code.

 - In the series "add mTHP support for anonymous shmem" Baolin Wang has
   taught anonymous shmem mappings to use multisize THP. By default this
   is a no-op - users must opt in vis sysfs controls. Dramatic
   improvements in pagefault latency are realized.

 - David Hildenbrand has some cleanups to our remaining use of
   page_mapcount() in the series "fs/proc: move page_mapcount() to
   fs/proc/internal.h".

 - David also has some highmem accounting cleanups in the series
   "mm/highmem: don't track highmem pages manually".

 - Build-time fixes and cleanups from John Hubbard in the series
   "cleanups, fixes, and progress towards avoiding "make headers"".

 - Cleanups and consolidation of the core pagemap handling from Barry
   Song in the series "mm: introduce pmd|pte_needs_soft_dirty_wp helpers
   and utilize them".

 - Lance Yang's series "Reclaim lazyfree THP without splitting" has
   reduced the latency of the reclaim of pmd-mapped THPs under fairly
   common circumstances. A 10x speedup is seen in a microbenchmark.

   It does this by punting to aother CPU but I guess that's a win unless
   all CPUs are pegged.

 - hugetlb_cgroup cleanups from Xiu Jianfeng in the series
   "mm/hugetlb_cgroup: rework on cftypes".

 - Miaohe Lin's series "Some cleanups for memory-failure" does just that
   thing.

 - Someone other than SeongJae has developed a DAMON feature in Honggyu
   Kim's series "DAMON based tiered memory management for CXL memory".
   This adds DAMON features which may be used to help determine the
   efficiency of our placement of CXL/PCIe attached DRAM.

 - DAMON user API centralization and simplificatio work in SeongJae
   Park's series "mm/damon: introduce DAMON parameters online commit
   function".

 - In the series "mm: page_type, zsmalloc and page_mapcount_reset()"
   David Hildenbrand does some maintenance work on zsmalloc - partially
   modernizing its use of pageframe fields.

 - Kefeng Wang provides more folio conversions in the series "mm: remove
   page_maybe_dma_pinned() and page_mkclean()".

 - More cleanup from David Hildenbrand, this time in the series
   "mm/memory_hotplug: use PageOffline() instead of PageReserved() for
   !ZONE_DEVICE". It "enlightens memory hotplug more about PageOffline()
   pages" and permits the removal of some virtio-mem hacks.

 - Barry Song's series "mm: clarify folio_add_new_anon_rmap() and
   __folio_add_anon_rmap()" is a cleanup to the anon folio handling in
   preparation for mTHP (multisize THP) swapin.

 - Kefeng Wang's series "mm: improve clear and copy user folio"
   implements more folio conversions, this time in the area of large
   folio userspace copying.

 - The series "Docs/mm/damon/maintaier-profile: document a mailing tool
   and community meetup series" tells people how to get better involved
   with other DAMON developers. From SeongJae Park.

 - A large series ("kmsan: Enable on s390") from Ilya Leoshkevich does
   that.

 - David Hildenbrand sends along more cleanups, this time against the
   migration code. The series is "mm/migrate: move NUMA hinting fault
   folio isolation + checks under PTL".

 - Jan Kara has found quite a lot of strangenesses and minor errors in
   the readahead code. He addresses this in the series "mm: Fix various
   readahead quirks".

 - SeongJae Park's series "selftests/damon: test DAMOS tried regions and
   {min,max}_nr_regions" adds features and addresses errors in DAMON's
   self testing code.

 - Gavin Shan has found a userspace-triggerable WARN in the pagecache
   code. The series "mm/filemap: Limit page cache size to that supported
   by xarray" addresses this. The series is marked cc:stable.

 - Chengming Zhou's series "mm/ksm: cmp_and_merge_page() optimizations
   and cleanup" cleans up and slightly optimizes KSM.

 - Roman Gushchin has separated the memcg-v1 and memcg-v2 code - lots of
   code motion. The series (which also makes the memcg-v1 code
   Kconfigurable) are "mm: memcg: separate legacy cgroup v1 code and put
   under config option" and "mm: memcg: put cgroup v1-specific memcg
   data under CONFIG_MEMCG_V1"

 - Dan Schatzberg's series "Add swappiness argument to memory.reclaim"
   adds an additional feature to this cgroup-v2 control file.

 - The series "Userspace controls soft-offline pages" from Jiaqi Yan
   permits userspace to stop the kernel's automatic treatment of
   excessive correctable memory errors. In order to permit userspace to
   monitor and handle this situation.

 - Kefeng Wang's series "mm: migrate: support poison recover from
   migrate folio" teaches the kernel to appropriately handle migration
   from poisoned source folios rather than simply panicing.

 - SeongJae Park's series "Docs/damon: minor fixups and improvements"
   does those things.

 - In the series "mm/zsmalloc: change back to per-size_class lock"
   Chengming Zhou improves zsmalloc's scalability and memory
   utilization.

 - Vivek Kasireddy's series "mm/gup: Introduce memfd_pin_folios() for
   pinning memfd folios" makes the GUP code use FOLL_PIN rather than
   bare refcount increments. So these paes can first be moved aside if
   they reside in the movable zone or a CMA block.

 - Andrii Nakryiko has added a binary ioctl()-based API to
   /proc/pid/maps for much faster reading of vma information. The series
   is "query VMAs from /proc/<pid>/maps".

 - In the series "mm: introduce per-order mTHP split counters" Lance
   Yang improves the kernel's presentation of developer information
   related to multisize THP splitting.

 - Michael Ellerman has developed the series "Reimplement huge pages
   without hugepd on powerpc (8xx, e500, book3s/64)". This permits
   userspace to use all available huge page sizes.

 - In the series "revert unconditional slab and page allocator fault
   injection calls" Vlastimil Babka removes a performance-affecting and
   not very useful feature from slab fault injection.

* tag 'mm-stable-2024-07-21-14-50' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (411 commits)
  mm/mglru: fix ineffective protection calculation
  mm/zswap: fix a white space issue
  mm/hugetlb: fix kernel NULL pointer dereference when migrating hugetlb folio
  mm/hugetlb: fix possible recursive locking detected warning
  mm/gup: clear the LRU flag of a page before adding to LRU batch
  mm/numa_balancing: teach mpol_to_str about the balancing mode
  mm: memcg1: convert charge move flags to unsigned long long
  alloc_tag: fix page_ext_get/page_ext_put sequence during page splitting
  lib: reuse page_ext_data() to obtain codetag_ref
  lib: add missing newline character in the warning message
  mm/mglru: fix overshooting shrinker memory
  mm/mglru: fix div-by-zero in vmpressure_calc_level()
  mm/kmemleak: replace strncpy() with strscpy()
  mm, page_alloc: put should_fail_alloc_page() back behing CONFIG_FAIL_PAGE_ALLOC
  mm, slab: put should_failslab() back behind CONFIG_SHOULD_FAILSLAB
  mm: ignore data-race in __swap_writepage
  hugetlbfs: ensure generic_hugetlb_get_unmapped_area() returns higher address than mmap_min_addr
  mm: shmem: rename mTHP shmem counters
  mm: swap_state: use folio_alloc_mpol() in __read_swap_cache_async()
  mm/migrate: putback split folios when numa hint migration fails
  ...
2024-07-21 17:15:46 -07:00
Linus Torvalds
2c9b351240 ARM:
* Initial infrastructure for shadow stage-2 MMUs, as part of nested
   virtualization enablement
 
 * Support for userspace changes to the guest CTR_EL0 value, enabling
   (in part) migration of VMs between heterogenous hardware
 
 * Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1 of
   the protocol
 
 * FPSIMD/SVE support for nested, including merged trap configuration
   and exception routing
 
 * New command-line parameter to control the WFx trap behavior under KVM
 
 * Introduce kCFI hardening in the EL2 hypervisor
 
 * Fixes + cleanups for handling presence/absence of FEAT_TCRX
 
 * Miscellaneous fixes + documentation updates
 
 LoongArch:
 
 * Add paravirt steal time support.
 
 * Add support for KVM_DIRTY_LOG_INITIALLY_SET.
 
 * Add perf kvm-stat support for loongarch.
 
 RISC-V:
 
 * Redirect AMO load/store access fault traps to guest
 
 * perf kvm stat support
 
 * Use guest files for IMSIC virtualization, when available
 
 ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA
 extensions is coming through the RISC-V tree.
 
 s390:
 
 * Assortment of tiny fixes which are not time critical
 
 x86:
 
 * Fixes for Xen emulation.
 
 * Add a global struct to consolidate tracking of host values, e.g. EFER
 
 * Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the effective APIC
   bus frequency, because TDX.
 
 * Print the name of the APICv/AVIC inhibits in the relevant tracepoint.
 
 * Clean up KVM's handling of vendor specific emulation to consistently act on
   "compatible with Intel/AMD", versus checking for a specific vendor.
 
 * Drop MTRR virtualization, and instead always honor guest PAT on CPUs
   that support self-snoop.
 
 * Update to the newfangled Intel CPU FMS infrastructure.
 
 * Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as it reads
   '0' and writes from userspace are ignored.
 
 * Misc cleanups
 
 x86 - MMU:
 
 * Small cleanups, renames and refactoring extracted from the upcoming
   Intel TDX support.
 
 * Don't allocate kvm_mmu_page.shadowed_translation for shadow pages that can't
   hold leafs SPTEs.
 
 * Unconditionally drop mmu_lock when allocating TDP MMU page tables for eager
   page splitting, to avoid stalling vCPUs when splitting huge pages.
 
 * Bug the VM instead of simply warning if KVM tries to split a SPTE that is
   non-present or not-huge.  KVM is guaranteed to end up in a broken state
   because the callers fully expect a valid SPTE, it's all but dangerous
   to let more MMU changes happen afterwards.
 
 x86 - AMD:
 
 * Make per-CPU save_area allocations NUMA-aware.
 
 * Force sev_es_host_save_area() to be inlined to avoid calling into an
   instrumentable function from noinstr code.
 
 * Base support for running SEV-SNP guests.  API-wise, this includes
   a new KVM_X86_SNP_VM type, encrypting/measure the initial image into
   guest memory, and finalizing it before launching it.  Internally,
   there are some gmem/mmu hooks needed to prepare gmem-allocated pages
   before mapping them into guest private memory ranges.
 
   This includes basic support for attestation guest requests, enough to
   say that KVM supports the GHCB 2.0 specification.
 
   There is no support yet for loading into the firmware those signing
   keys to be used for attestation requests, and therefore no need yet
   for the host to provide certificate data for those keys.  To support
   fetching certificate data from userspace, a new KVM exit type will be
   needed to handle fetching the certificate from userspace. An attempt to
   define a new KVM_EXIT_COCO/KVM_EXIT_COCO_REQ_CERTS exit type to handle
   this was introduced in v1 of this patchset, but is still being discussed
   by community, so for now this patchset only implements a stub version
   of SNP Extended Guest Requests that does not provide certificate data.
 
 x86 - Intel:
 
 * Remove an unnecessary EPT TLB flush when enabling hardware.
 
 * Fix a series of bugs that cause KVM to fail to detect nested pending posted
   interrupts as valid wake eents for a vCPU executing HLT in L2 (with
   HLT-exiting disable by L1).
 
 * KVM: x86: Suppress MMIO that is triggered during task switch emulation
 
   Explicitly suppress userspace emulated MMIO exits that are triggered when
   emulating a task switch as KVM doesn't support userspace MMIO during
   complex (multi-step) emulation.  Silently ignoring the exit request can
   result in the WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to
   userspace for some other reason prior to purging mmio_needed.
 
   See commit 0dc902267c ("KVM: x86: Suppress pending MMIO write exits if
   emulator detects exception") for more details on KVM's limitations with
   respect to emulated MMIO during complex emulator flows.
 
 Generic:
 
 * Rename the AS_UNMOVABLE flag that was introduced for KVM to AS_INACCESSIBLE,
   because the special casing needed by these pages is not due to just
   unmovability (and in fact they are only unmovable because the CPU cannot
   access them).
 
 * New ioctl to populate the KVM page tables in advance, which is useful to
   mitigate KVM page faults during guest boot or after live migration.
   The code will also be used by TDX, but (probably) not through the ioctl.
 
 * Enable halt poll shrinking by default, as Intel found it to be a clear win.
 
 * Setup empty IRQ routing when creating a VM to avoid having to synchronize
   SRCU when creating a split IRQCHIP on x86.
 
 * Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
   that arch code can use for hooking both sched_in() and sched_out().
 
 * Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
   truncating a bogus value from userspace, e.g. to help userspace detect bugs.
 
 * Mark a vCPU as preempted if and only if it's scheduled out while in the
   KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
   memory when retrieving guest state during live migration blackout.
 
 Selftests:
 
 * Remove dead code in the memslot modification stress test.
 
 * Treat "branch instructions retired" as supported on all AMD Family 17h+ CPUs.
 
 * Print the guest pseudo-RNG seed only when it changes, to avoid spamming the
   log for tests that create lots of VMs.
 
 * Make the PMU counters test less flaky when counting LLC cache misses by
   doing CLFLUSH{OPT} in every loop iteration.
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmaZQB0UHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNkZwf/bv2jiENaLFNGPe/VqTKMQ6PHQLMG
 +sNHx6fJPP35gTM8Jqf0/7/ummZXcSuC1mWrzYbecZm7Oeg3vwNXHZ4LquwwX6Dv
 8dKcUzLbWDAC4WA3SKhi8C8RV2v6E7ohy69NtAJmFWTc7H95dtIQm6cduV2osTC3
 OEuHe1i8d9umk6couL9Qhm8hk3i9v2KgCsrfyNrQgLtS3hu7q6yOTR8nT0iH6sJR
 KE5A8prBQgLmF34CuvYDw4Hu6E4j+0QmIqodovg2884W1gZQ9LmcVqYPaRZGsG8S
 iDdbkualLKwiR1TpRr3HJGKWSFdc7RblbsnHRvHIZgFsMQiimh4HrBSCyQ==
 =zepX
 -----END PGP SIGNATURE-----

Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm

Pull kvm updates from Paolo Bonzini:
 "ARM:

   - Initial infrastructure for shadow stage-2 MMUs, as part of nested
     virtualization enablement

   - Support for userspace changes to the guest CTR_EL0 value, enabling
     (in part) migration of VMs between heterogenous hardware

   - Fixes + improvements to pKVM's FF-A proxy, adding support for v1.1
     of the protocol

   - FPSIMD/SVE support for nested, including merged trap configuration
     and exception routing

   - New command-line parameter to control the WFx trap behavior under
     KVM

   - Introduce kCFI hardening in the EL2 hypervisor

   - Fixes + cleanups for handling presence/absence of FEAT_TCRX

   - Miscellaneous fixes + documentation updates

  LoongArch:

   - Add paravirt steal time support

   - Add support for KVM_DIRTY_LOG_INITIALLY_SET

   - Add perf kvm-stat support for loongarch

  RISC-V:

   - Redirect AMO load/store access fault traps to guest

   - perf kvm stat support

   - Use guest files for IMSIC virtualization, when available

  s390:

   - Assortment of tiny fixes which are not time critical

  x86:

   - Fixes for Xen emulation

   - Add a global struct to consolidate tracking of host values, e.g.
     EFER

   - Add KVM_CAP_X86_APIC_BUS_CYCLES_NS to allow configuring the
     effective APIC bus frequency, because TDX

   - Print the name of the APICv/AVIC inhibits in the relevant
     tracepoint

   - Clean up KVM's handling of vendor specific emulation to
     consistently act on "compatible with Intel/AMD", versus checking
     for a specific vendor

   - Drop MTRR virtualization, and instead always honor guest PAT on
     CPUs that support self-snoop

   - Update to the newfangled Intel CPU FMS infrastructure

   - Don't advertise IA32_PERF_GLOBAL_OVF_CTRL as an MSR-to-be-saved, as
     it reads '0' and writes from userspace are ignored

   - Misc cleanups

  x86 - MMU:

   - Small cleanups, renames and refactoring extracted from the upcoming
     Intel TDX support

   - Don't allocate kvm_mmu_page.shadowed_translation for shadow pages
     that can't hold leafs SPTEs

   - Unconditionally drop mmu_lock when allocating TDP MMU page tables
     for eager page splitting, to avoid stalling vCPUs when splitting
     huge pages

   - Bug the VM instead of simply warning if KVM tries to split a SPTE
     that is non-present or not-huge. KVM is guaranteed to end up in a
     broken state because the callers fully expect a valid SPTE, it's
     all but dangerous to let more MMU changes happen afterwards

  x86 - AMD:

   - Make per-CPU save_area allocations NUMA-aware

   - Force sev_es_host_save_area() to be inlined to avoid calling into
     an instrumentable function from noinstr code

   - Base support for running SEV-SNP guests. API-wise, this includes a
     new KVM_X86_SNP_VM type, encrypting/measure the initial image into
     guest memory, and finalizing it before launching it. Internally,
     there are some gmem/mmu hooks needed to prepare gmem-allocated
     pages before mapping them into guest private memory ranges

     This includes basic support for attestation guest requests, enough
     to say that KVM supports the GHCB 2.0 specification

     There is no support yet for loading into the firmware those signing
     keys to be used for attestation requests, and therefore no need yet
     for the host to provide certificate data for those keys.

     To support fetching certificate data from userspace, a new KVM exit
     type will be needed to handle fetching the certificate from
     userspace.

     An attempt to define a new KVM_EXIT_COCO / KVM_EXIT_COCO_REQ_CERTS
     exit type to handle this was introduced in v1 of this patchset, but
     is still being discussed by community, so for now this patchset
     only implements a stub version of SNP Extended Guest Requests that
     does not provide certificate data

  x86 - Intel:

   - Remove an unnecessary EPT TLB flush when enabling hardware

   - Fix a series of bugs that cause KVM to fail to detect nested
     pending posted interrupts as valid wake eents for a vCPU executing
     HLT in L2 (with HLT-exiting disable by L1)

   - KVM: x86: Suppress MMIO that is triggered during task switch
     emulation

     Explicitly suppress userspace emulated MMIO exits that are
     triggered when emulating a task switch as KVM doesn't support
     userspace MMIO during complex (multi-step) emulation

     Silently ignoring the exit request can result in the
     WARN_ON_ONCE(vcpu->mmio_needed) firing if KVM exits to userspace
     for some other reason prior to purging mmio_needed

     See commit 0dc902267c ("KVM: x86: Suppress pending MMIO write
     exits if emulator detects exception") for more details on KVM's
     limitations with respect to emulated MMIO during complex emulator
     flows

  Generic:

   - Rename the AS_UNMOVABLE flag that was introduced for KVM to
     AS_INACCESSIBLE, because the special casing needed by these pages
     is not due to just unmovability (and in fact they are only
     unmovable because the CPU cannot access them)

   - New ioctl to populate the KVM page tables in advance, which is
     useful to mitigate KVM page faults during guest boot or after live
     migration. The code will also be used by TDX, but (probably) not
     through the ioctl

   - Enable halt poll shrinking by default, as Intel found it to be a
     clear win

   - Setup empty IRQ routing when creating a VM to avoid having to
     synchronize SRCU when creating a split IRQCHIP on x86

   - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with
     a flag that arch code can use for hooking both sched_in() and
     sched_out()

   - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
     truncating a bogus value from userspace, e.g. to help userspace
     detect bugs

   - Mark a vCPU as preempted if and only if it's scheduled out while in
     the KVM_RUN loop, e.g. to avoid marking it preempted and thus
     writing guest memory when retrieving guest state during live
     migration blackout

  Selftests:

   - Remove dead code in the memslot modification stress test

   - Treat "branch instructions retired" as supported on all AMD Family
     17h+ CPUs

   - Print the guest pseudo-RNG seed only when it changes, to avoid
     spamming the log for tests that create lots of VMs

   - Make the PMU counters test less flaky when counting LLC cache
     misses by doing CLFLUSH{OPT} in every loop iteration"

* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm: (227 commits)
  crypto: ccp: Add the SNP_VLEK_LOAD command
  KVM: x86/pmu: Add kvm_pmu_call() to simplify static calls of kvm_pmu_ops
  KVM: x86: Introduce kvm_x86_call() to simplify static calls of kvm_x86_ops
  KVM: x86: Replace static_call_cond() with static_call()
  KVM: SEV: Provide support for SNP_EXTENDED_GUEST_REQUEST NAE event
  x86/sev: Move sev_guest.h into common SEV header
  KVM: SEV: Provide support for SNP_GUEST_REQUEST NAE event
  KVM: x86: Suppress MMIO that is triggered during task switch emulation
  KVM: x86/mmu: Clean up make_huge_page_split_spte() definition and intro
  KVM: x86/mmu: Bug the VM if KVM tries to split a !hugepage SPTE
  KVM: selftests: x86: Add test for KVM_PRE_FAULT_MEMORY
  KVM: x86: Implement kvm_arch_vcpu_pre_fault_memory()
  KVM: x86/mmu: Make kvm_mmu_do_page_fault() return mapped level
  KVM: x86/mmu: Account pf_{fixed,emulate,spurious} in callers of "do page fault"
  KVM: x86/mmu: Bump pf_taken stat only in the "real" page fault handler
  KVM: Add KVM_PRE_FAULT_MEMORY vcpu ioctl to pre-populate guest memory
  KVM: Document KVM_PRE_FAULT_MEMORY ioctl
  mm, virt: merge AS_UNMOVABLE and AS_INACCESSIBLE
  perf kvm: Add kvm-stat for loongarch64
  LoongArch: KVM: Add PV steal time support in guest side
  ...
2024-07-20 12:41:03 -07:00
Linus Torvalds
f557af081d RISC-V Patches for the 6.11 Merge Window, Part 1
* Support for various new ISA extensions:
     * The Zve32[xf] and Zve64[xfd] sub-extensios of the vector
       extension.
     * Zimop and Zcmop for may-be-operations.
     * The Zca, Zcf, Zcd and Zcb sub-extensions of the C extension.
     * Zawrs,
 * riscv,cpu-intc is now dtschema.
 * A handful of performance improvements and cleanups to text patching.
 * Support for memory hot{,un}plug
 * The highest user-allocatable virtual address is now visible in
   hwprobe.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmabIGETHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiQe8D/9QPCaOnoP5OCZbwjkRBwaVxyknNyD0
 l+YNXk7Jk3B/oaOv3d7Bz+uWt1SG4j4jkfyuGJ81StZykp4/R7T823TZrPhog9VX
 IJm580MtvE49I2i1qJ+ZQti9wpiM+80lFnyMCzY6S7rrM9m62tKgUpARZcWoA55P
 iUo5bku99TYCcU2k1pnPrNSPQvVpECpv7tG0PwKpQd5DiYjbPp+aw5cQWN+izdOB
 6raOZ0buzP7McszvO/gcJs+kuHwrp0JSRvNxc2pwYZ0lx00p3hSV8UdtIMlI9Qm/
 z3gkQGHwc6UVMPHo1x0Gr5ShUTCI/iSwy4/7aY4NNXF6Sj99b8alt9GcbYqNAE7V
 k7sibCR7dhL4ods/GFMmzR7cQYlwlwtO+/ILak7rXhNvA32Xy1WUABguhP9ElTmw
 1ZS2hnRv6wc7MA2V7HBamf5mPXM6HQyC3oKy3njzDSJdiGIG7aa+TOfRAD+L/1Du
 QjIrKp6XcPIsZNjh8H3nMDVJ0VvDNnS4d4LbfNQc23VPzf57kFUqbli1pS0hBjFT
 ELEItH9dgSx+T5Qebdy/QMC3RG8Yc1IUdw6VQ7Jny/uCCEZNq+VZ+bXxspMmswCp
 sUIyDplJTJfRt3G2OxK0b95x6oj8jbaJOQfv6PBF71dDBsChg8eXFVJ2NDrX4Bvr
 h2MPK7vGBtFz8w==
 =+ICi
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.11-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V updates from Palmer Dabbelt:

 - Support for various new ISA extensions:
     * The Zve32[xf] and Zve64[xfd] sub-extensios of the vector
       extension
     * Zimop and Zcmop for may-be-operations
     * The Zca, Zcf, Zcd and Zcb sub-extensions of the C extension
     * Zawrs

 - riscv,cpu-intc is now dtschema

 - A handful of performance improvements and cleanups to text patching

 - Support for memory hot{,un}plug

 - The highest user-allocatable virtual address is now visible in
   hwprobe

* tag 'riscv-for-linus-6.11-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (58 commits)
  riscv: lib: relax assembly constraints in hweight
  riscv: set trap vector earlier
  KVM: riscv: selftests: Add Zawrs extension to get-reg-list test
  KVM: riscv: Support guest wrs.nto
  riscv: hwprobe: export Zawrs ISA extension
  riscv: Add Zawrs support for spinlocks
  dt-bindings: riscv: Add Zawrs ISA extension description
  riscv: Provide a definition for 'pause'
  riscv: hwprobe: export highest virtual userspace address
  riscv: Improve sbi_ecall() code generation by reordering arguments
  riscv: Add tracepoints for SBI calls and returns
  riscv: Optimize crc32 with Zbc extension
  riscv: Enable DAX VMEMMAP optimization
  riscv: mm: Add support for ZONE_DEVICE
  virtio-mem: Enable virtio-mem for RISC-V
  riscv: Enable memory hotplugging for RISC-V
  riscv: mm: Take memory hotplug read-lock during kernel page table dump
  riscv: mm: Add memory hotplugging support
  riscv: mm: Add pfn_to_kaddr() implementation
  riscv: mm: Refactor create_linear_mapping_range() for memory hot add
  ...
2024-07-20 09:11:27 -07:00
Zhang Bingwu
af7925d820 kbuild: Abort make on install failures
Setting '-e' flag tells shells to exit with error exit code immediately
after any of commands fails, and causes make(1) to regard recipes as
failed.

Before this, make will still continue to succeed even after the
installation failed, for example, for insufficient permission or
directory does not exist.

Signed-off-by: Zhang Bingwu <xtexchooser@duck.com>
Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-20 13:34:54 +09:00
Linus Torvalds
aba9753c06 TTY/Serial updates for 6.11-rc1
Here is a small set of tty and serial driver updates for 6.11-rc1.  Not
 much happened this cycle, unlike the previous kernel release which had
 lots of "excitement" in this part of the kernel.  Included in here are
 the following changes:
   - dt binding updates for new platforms
   - 8250 driver updates
   - various small serial driver fixes and updates
   - printk/console naming and matching attempt #2 (was reverted for
     6.10-final, should be good to go this time around, acked by the
     relevant maintainers).
 
 All of these have been in linux-next for a while with no reported
 issues.
 
 Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 -----BEGIN PGP SIGNATURE-----
 
 iG0EABECAC0WIQT0tgzFv3jCIUoxPcsxR9QN2y37KQUCZppbCQ8cZ3JlZ0Brcm9h
 aC5jb20ACgkQMUfUDdst+ymV1ACeIY5kgipqY7w4d3/7PcpKMiftrisAn0hr6csj
 Gan+k3cuVGlasGkaQ5/B
 =35VK
 -----END PGP SIGNATURE-----

Merge tag 'tty-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty

Pull tty / serial updates from Greg KH:
 "Here is a small set of tty and serial driver updates for 6.11-rc1. Not
  much happened this cycle, unlike the previous kernel release which had
  lots of "excitement" in this part of the kernel. Included in here are
  the following changes:

   - dt binding updates for new platforms

   - 8250 driver updates

   - various small serial driver fixes and updates

   - printk/console naming and matching attempt #2 (was reverted for
     6.10-final, should be good to go this time around, acked by the
     relevant maintainers).

  All of these have been in linux-next for a while with no reported
  issues"

* tag 'tty-6.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty: (22 commits)
  Documentation: kernel-parameters: Add DEVNAME:0.0 format for serial ports
  serial: core: Add serial_base_match_and_update_preferred_console()
  printk: Add match_devname_and_update_preferred_console()
  serial: sc16is7xx: hardware reset chip if reset-gpios is defined in DT
  dt-bindings: serial: sc16is7xx: add reset-gpios
  dt-bindings: serial: vt8500-uart: convert to json-schema
  serial: 8250_platform: Explicitly show we initialise ISA ports only once
  tty: add missing MODULE_DESCRIPTION() macros
  dt-bindings: serial: mediatek,uart: add MT7988
  serial: sh-sci: Add support for RZ/V2H(P) SoC
  dt-bindings: serial: Add documentation for Renesas RZ/V2H(P) (R9A09G057) SCIF support
  dt-bindings: serial: renesas,scif: Make 'interrupt-names' property as required
  dt-bindings: serial: renesas,scif: Validate 'interrupts' and 'interrupt-names'
  dt-bindings: serial: renesas,scif: Move ref for serial.yaml at the end
  riscv: dts: starfive: jh7110: Add the core reset and jh7110 compatible for uarts
  serial: 8250_dw: Use reset array API to get resets
  dt-bindings: serial: snps-dw-apb-uart: Add one more reset signal for StarFive JH7110 SoC
  serial: 8250: Extract platform driver
  serial: 8250: Extract RSA bits
  serial: imx: stop casting struct uart_port to struct imx_port
  ...
2024-07-19 15:22:14 -07:00
Linus Torvalds
70045bfc4c ftrace: Rewrite of function graph tracer
Up until now, the function graph tracer could only have a single user
 attached to it. If another user tried to attach to the function graph
 tracer while one was already attached, it would fail. Allowing function
 graph tracer to have more than one user has been asked for since 2009, but
 it required a rewrite to the logic to pull it off so it never happened.
 Until now!
 
 There's three systems that trace the return of a function. That is
 kretprobes, function graph tracer, and BPF. kretprobes and function graph
 tracing both do it similarly. The difference is that kretprobes uses a
 shadow stack per callback and function graph tracer creates a shadow stack
 for all tasks. The function graph tracer method makes it possible to trace
 the return of all functions. As kretprobes now needs that feature too,
 allowing it to use function graph tracer was needed. BPF also wants to
 trace the return of many probes and its method doesn't scale either.
 Having it use function graph tracer would improve that.
 
 By allowing function graph tracer to have multiple users allows both
 kretprobes and BPF to use function graph tracer in these cases. This will
 allow kretprobes code to be removed in the future as it's version will no
 longer be needed. Note, function graph tracer is only limited to 16
 simultaneous users, due to shadow stack size and allocated slots.
 -----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQRRSw7ePDh/lE+zeZMp5XQQmuv6qgUCZpbWlxQccm9zdGVkdEBn
 b29kbWlzLm9yZwAKCRAp5XQQmuv6qgtvAP9jxmgEiEhz4Bpe1vRKVSMYK6ozXHTT
 7MFKRMeQqQ8zeAEA2sD5Zrt9l7zKzg0DFpaDLgc3/yh14afIDxzTlIvkmQ8=
 =umuf
 -----END PGP SIGNATURE-----

Merge tag 'ftrace-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace

Pull ftrace updates from Steven Rostedt:
 "Rewrite of function graph tracer to allow multiple users

  Up until now, the function graph tracer could only have a single user
  attached to it. If another user tried to attach to the function graph
  tracer while one was already attached, it would fail. Allowing
  function graph tracer to have more than one user has been asked for
  since 2009, but it required a rewrite to the logic to pull it off so
  it never happened. Until now!

  There's three systems that trace the return of a function. That is
  kretprobes, function graph tracer, and BPF. kretprobes and function
  graph tracing both do it similarly. The difference is that kretprobes
  uses a shadow stack per callback and function graph tracer creates a
  shadow stack for all tasks. The function graph tracer method makes it
  possible to trace the return of all functions. As kretprobes now needs
  that feature too, allowing it to use function graph tracer was needed.
  BPF also wants to trace the return of many probes and its method
  doesn't scale either. Having it use function graph tracer would
  improve that.

  By allowing function graph tracer to have multiple users allows both
  kretprobes and BPF to use function graph tracer in these cases. This
  will allow kretprobes code to be removed in the future as it's version
  will no longer be needed.

  Note, function graph tracer is only limited to 16 simultaneous users,
  due to shadow stack size and allocated slots"

* tag 'ftrace-v6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace: (49 commits)
  fgraph: Use str_plural() in test_graph_storage_single()
  function_graph: Add READ_ONCE() when accessing fgraph_array[]
  ftrace: Add missing kerneldoc parameters to unregister_ftrace_direct()
  function_graph: Everyone uses HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, remove it
  function_graph: Fix up ftrace_graph_ret_addr()
  function_graph: Make fgraph_update_pid_func() a stub for !DYNAMIC_FTRACE
  function_graph: Rename BYTE_NUMBER to CHAR_NUMBER in selftests
  fgraph: Remove some unused functions
  ftrace: Hide one more entry in stack trace when ftrace_pid is enabled
  function_graph: Do not update pid func if CONFIG_DYNAMIC_FTRACE not enabled
  function_graph: Make fgraph_do_direct static key static
  ftrace: Fix prototypes for ftrace_startup/shutdown_subops()
  ftrace: Assign RCU list variable with rcu_assign_ptr()
  ftrace: Assign ftrace_list_end to ftrace_ops_list type cast to RCU
  ftrace: Declare function_trace_op in header to quiet sparse warning
  ftrace: Add comments to ftrace_hash_move() and friends
  ftrace: Convert "inc" parameter to bool in ftrace_hash_rec_update_modify()
  ftrace: Add comments to ftrace_hash_rec_disable/enable()
  ftrace: Remove "filter_hash" parameter from __ftrace_hash_rec_update()
  ftrace: Rename dup_hash() and comment it
  ...
2024-07-18 13:36:33 -07:00
Linus Torvalds
51835949dd Networking changes for 6.11. Not much excitement - a handful of large
patchsets (devmem among them) did not make it in time.
 
 Core & protocols
 ----------------
 
  - Use local_lock in addition to local_bh_disable() to protect per-CPU
    resources in networking, a step closer for local_bh_disable() not
    to act as a big lock on PREEMPT_RT.
 
  - Use flex array for netdevice priv area, ensure its cache alignment.
 
  - Add a sysctl knob to allow user to specify a default rto_min at socket
    init time. Bit of a big hammer but multiple companies were
    independently carrying such patch downstream so clearly it's useful.
 
  - Support scheduling transmission of packets based on CLOCK_TAI.
 
  - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned off
    using cpusets.
 
  - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address.
 
  - Allow configuration of multipath hash seed, to both allow synchronizing
    hashing of two routers, and preventing partial accidental sync.
 
  - Improve TCP compliance with RFC 9293 for simultaneous connect().
 
  - Support sending NAT keepalives in IPsec ESP in UDP states. Userspace
    IKE daemon had to do this before, but the kernel can better keep
    track of it.
 
  - Support sending supervision HSR frames with MAC addresses stored in
    ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled.
 
  - Introduce IPPROTO_SMC for selecting SMC when socket is created.
 
  - Allow UDP GSO transmit from devices with no checksum offload.
 
  - openvswitch: add packet sampling via psample, separating the sampled
    traffic from "upcall" packets sent to user space for forwarding.
 
  - nf_tables: shrink memory consumption for transaction objects.
 
 Things we sprinkled into general kernel code
 --------------------------------------------
 
  - Power Sequencing subsystem (used by Qualcomm Bluetooth driver
    for QCA6390).
 
  - Add IRQ information in sysfs for auxiliary bus.
 
  - Introduce guard definition for local_lock.
 
  - Add aligned flavor of __cacheline_group_{begin, end}() markings for
    grouping fields in structures.
 
 BPF
 ---
 
  - Notify user space (via epoll) when a struct_ops object is getting
    detached/unregistered.
 
  - Add new kfuncs for a generic, open-coded bits iterator.
 
  - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
    bpf_list_head.
 
  - Support resilient split BTF which cuts down on duplication and makes
    BTF as compact as possible WRT BTF from modules.
 
  - Add support for dumping kfunc prototypes from BTF which enables both
    detecting as well as dumping compilable prototypes for kfuncs.
 
  - riscv64 BPF JIT improvements in particular to add 12-argument support
    for BPF trampolines and to utilize bpf_prog_pack for the latter.
 
  - Add the capability to offload the netfilter flowtable in XDP layer
    through kfuncs.
 
 Driver API
 ----------
 
  - Allow users to configure IRQ tresholds between which automatic IRQ
    moderation can choose.
 
  - Expand Power Sourcing (PoE) status with power, class and failure
    reason. Support setting power limits.
 
  - Track additional RSS contexts in the core, make sure configuration
    changes don't break them.
 
  - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated ESP
    data paths.
 
  - Support updating firmware on SFP modules.
 
 Tests and tooling
 -----------------
 
  - mptcp: use net/lib.sh to manage netns.
 
  - TCP-AO and TCP-MD5: replace debug prints used by tests with
    tracepoints.
 
  - openvswitch: make test self-contained (don't depend on OvS CLI tools).
 
 Drivers
 -------
 
  - Ethernet high-speed NICs:
    - Broadcom (bnxt):
      - increase the max total outstanding PTP TX packets to 4
      - add timestamping statistics support
      - implement netdev_queue_mgmt_ops
      - support new RSS context API
    - Intel (100G, ice, idpf):
      - implement FEC statistics and dumping signal quality indicators
      - support E825C products (with 56Gbps PHYs)
    - nVidia/Mellanox:
      - support HW-GRO
      - mlx4/mlx5: support per-queue statistics via netlink
      - obey the max number of EQs setting in sub-functions
    - AMD/Solarflare:
      - support new RSS context API
    - AMD/Pensando:
      - ionic: rework fix for doorbell miss to lower overhead
        and skip it on new HW
    - Wangxun:
      - txgbe: support Flow Director perfect filters
 
  - Ethernet NICs consumer, embedded and virtual:
    - Add driver for Tehuti Networks TN40xx chips
    - Add driver for Meta's internal NIC chips
    - Add driver for Ethernet MAC on Airoha EN7581 SoCs
    - Add driver for Renesas Ethernet-TSN devices
    - Google cloud vNIC:
      - flow steering support
    - Microsoft vNIC:
      - support page sizes other than 4KB on ARM64
    - vmware vNIC:
      - support latency measurement (update to version 9)
    - VirtIO net:
      - support for Byte Queue Limits
      - support configuring thresholds for automatic IRQ moderation
      - support for AF_XDP Rx zero-copy
    - Synopsys (stmmac):
      - support for STM32MP13 SoC
      - let platforms select the right PCS implementation
    - TI:
      - icssg-prueth: add multicast filtering support
      - icssg-prueth: enable PTP timestamping and PPS
    - Renesas:
      - ravb: improve Rx performance 30-400% by using page pool,
        theaded NAPI and timer-based IRQ coalescing
      - ravb: add MII support for R-Car V4M
    - Cadence (macb):
      - macb: add ARP support to Wake-On-LAN
    - Cortina:
      - use phylib for RX and TX pause configuration
 
  - Ethernet switches:
    - nVidia/Mellanox:
      - support configuration of multipath hash seed
      - report more accurate max MTU
      - use page_pool to improve Rx performance
    - MediaTek:
      - mt7530: add support for bridge port isolation
    - Qualcomm:
      - qca8k: add support for bridge port isolation
    - Microchip:
      - lan9371/2: add 100BaseTX PHY support
    - NXP:
      - vsc73xx: implement VLAN operations
 
  - Ethernet PHYs:
    - aquantia: enable support for aqr115c
    - aquantia: add support for PHY LEDs
    - realtek: add support for rtl8224 2.5Gbps PHY
    - xpcs: add memory-mapped device support
    - add BroadR-Reach link mode and support in Broadcom's PHY driver
 
  - CAN:
    - add document for ISO 15765-2 protocol support
    - mcp251xfd: workaround for erratum DS80000789E, use timestamps
      to catch when device returns incorrect FIFO status
 
  - WiFi:
    - mac80211/cfg80211:
      - parse Transmit Power Envelope (TPE) data in mac80211 instead of
        in drivers
      - improvements for 6 GHz regulatory flexibility
      - multi-link improvements
      - support multiple radios per wiphy
      - remove DEAUTH_NEED_MGD_TX_PREP flag
    - Intel (iwlwifi):
      - bump FW API to 91 for BZ/SC devices
      - report 64-bit radiotap timestamp
      - enable P2P low latency by default
      - handle Transmit Power Envelope (TPE) advertised by AP
      - remove support for older FW for new devices
      - fast resume (keeping the device configured)
      - mvm: re-enable Multi-Link Operation (MLO)
      - aggregation (A-MSDU) optimizations
    - MediaTek (mt76):
      - mt7925 Multi-Link Operation (MLO) support
    - Qualcomm (ath10k):
      - LED support for various chipsets
    - Qualcomm (ath12k):
      - remove unsupported Tx monitor handling
      - support channel 2 in 6 GHz band
      - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
      - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
        Advertisements (EMA)
      - support dynamic VLAN
      - add panic handler for resetting the firmware state
      - DebugFS support for datapath statistics
      - WCN7850: support for Wake on WLAN
    - Microchip (wilc1000):
      - read MAC address during probe to make it visible to user space
      - suspend/resume improvements
    - TI (wl18xx):
      - support newer firmware versions
    - RealTek (rtw89):
      - preparation for RTL8852BE-VT support
      - Wake on WLAN support for WiFi 6 chips
      - 36-bit PCI DMA support
    - RealTek (rtlwifi):
      - RTL8192DU support
    - Broadcom (brcmfmac):
      - Management Frame Protection support (to enable WPA3)
 
  - Bluetooth:
    - qualcomm: use the power sequencer for QCA6390
    - btusb: mediatek: add ISO data transmission functions
    - hci_bcm4377: add BCM4388 support
    - btintel: add support for BlazarU core
    - btintel: add support for Whale Peak2
    - btnxpuart: add support for AW693 A1 chipset
    - btnxpuart: add support for IW615 chipset
    - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591
 
 Signed-off-by: Jakub Kicinski <kuba@kernel.org>
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE6jPA+I1ugmIBA4hXMUZtbf5SIrsFAmaWjBwACgkQMUZtbf5S
 IrvuSRAAkJuEzTRqgURBCe4eNEQde6mJJig7l2CKHwCbFiHZpRkFHf8qKbcGWbL6
 uLW33SWnKtJVDhxVKWHLq635XW7BAa80YhqGw21GDi+mIEhWXZglHj3xbXNxsMfE
 4eg/kG4BkfYWFmHaXOwVWV/mr7nXf6j7WmXNeXEi32ufE1j0OL+YlQenKnMj8yP2
 j9JmYa2Chwppng1SblHmcjmGkdNVwFhStKeCG+2K7v06wdDH/QYBlbgUv9gw/cxp
 NlW//wgiaeX40U4O3kDwt9C+LDoh+0VrDDeVdQ+IsScLtY3PhAzEoKolFYTq2HSr
 I1JpoaHNnyNsJq3DZrACQ5WlH4yDn6C2EUB6dxNnFaI9F1ZPsi+7MTl6Sei1AklD
 TuQTj/lxOACBwW2Q77NU72uoxiIUauesGPHcnrAFuoCIEhZF0mso7k59BvrXhsOP
 QwcLbQdc1YHNkqv/Vc7NBY+ruMsYB+5Ubbhhj2p27dp/CWFIwxI29fze4dn2uhO6
 ejHN3mbqwPdSzg12YJtM6Iq61Cnwo2eVSvhTxl+ZVSZtI4nu2arzR+y7QTYmNrXP
 6tkgVN9UsWeLl2xJ8wyyqL5mcvNHP2rPXWZ2X56iTaa26m+UlleeQ7YRaYtQAAr0
 Ec/vlDMX64SwHhd+qwE99DXGQf2g+KklHKSLsnajJUVrWFTlRI0=
 =opz8
 -----END PGP SIGNATURE-----

Merge tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next

Pull networking updates from Jakub Kicinski:
 "Not much excitement - a handful of large patchsets (devmem among them)
  did not make it in time.

  Core & protocols:

   - Use local_lock in addition to local_bh_disable() to protect per-CPU
     resources in networking, a step closer for local_bh_disable() not
     to act as a big lock on PREEMPT_RT

   - Use flex array for netdevice priv area, ensure its cache alignment

   - Add a sysctl knob to allow user to specify a default rto_min at
     socket init time. Bit of a big hammer but multiple companies were
     independently carrying such patch downstream so clearly it's useful

   - Support scheduling transmission of packets based on CLOCK_TAI

   - Un-pin TCP TIMEWAIT timer to avoid it firing on CPUs later cordoned
     off using cpusets

   - Support multiple L2TPv3 UDP tunnels using the same 5-tuple address

   - Allow configuration of multipath hash seed, to both allow
     synchronizing hashing of two routers, and preventing partial
     accidental sync

   - Improve TCP compliance with RFC 9293 for simultaneous connect()

   - Support sending NAT keepalives in IPsec ESP in UDP states.
     Userspace IKE daemon had to do this before, but the kernel can
     better keep track of it

   - Support sending supervision HSR frames with MAC addresses stored in
     ProxyNodeTable when RedBox (i.e. HSR-SAN) is enabled

   - Introduce IPPROTO_SMC for selecting SMC when socket is created

   - Allow UDP GSO transmit from devices with no checksum offload

   - openvswitch: add packet sampling via psample, separating the
     sampled traffic from "upcall" packets sent to user space for
     forwarding

   - nf_tables: shrink memory consumption for transaction objects

  Things we sprinkled into general kernel code:

   - Power Sequencing subsystem (used by Qualcomm Bluetooth driver for
     QCA6390)           [ Already merged separately - Linus ]

   - Add IRQ information in sysfs for auxiliary bus

   - Introduce guard definition for local_lock

   - Add aligned flavor of __cacheline_group_{begin, end}() markings for
     grouping fields in structures

  BPF:

   - Notify user space (via epoll) when a struct_ops object is getting
     detached/unregistered

   - Add new kfuncs for a generic, open-coded bits iterator

   - Enable BPF programs to declare arrays of kptr, bpf_rb_root, and
     bpf_list_head

   - Support resilient split BTF which cuts down on duplication and
     makes BTF as compact as possible WRT BTF from modules

   - Add support for dumping kfunc prototypes from BTF which enables
     both detecting as well as dumping compilable prototypes for kfuncs

   - riscv64 BPF JIT improvements in particular to add 12-argument
     support for BPF trampolines and to utilize bpf_prog_pack for the
     latter

   - Add the capability to offload the netfilter flowtable in XDP layer
     through kfuncs

  Driver API:

   - Allow users to configure IRQ tresholds between which automatic IRQ
     moderation can choose

   - Expand Power Sourcing (PoE) status with power, class and failure
     reason. Support setting power limits

   - Track additional RSS contexts in the core, make sure configuration
     changes don't break them

   - Support IPsec crypto offload for IPv6 ESP and IPv4 UDP-encapsulated
     ESP data paths

   - Support updating firmware on SFP modules

  Tests and tooling:

   - mptcp: use net/lib.sh to manage netns

   - TCP-AO and TCP-MD5: replace debug prints used by tests with
     tracepoints

   - openvswitch: make test self-contained (don't depend on OvS CLI
     tools)

  Drivers:

   - Ethernet high-speed NICs:
      - Broadcom (bnxt):
         - increase the max total outstanding PTP TX packets to 4
         - add timestamping statistics support
         - implement netdev_queue_mgmt_ops
         - support new RSS context API
      - Intel (100G, ice, idpf):
         - implement FEC statistics and dumping signal quality indicators
         - support E825C products (with 56Gbps PHYs)
      - nVidia/Mellanox:
         - support HW-GRO
         - mlx4/mlx5: support per-queue statistics via netlink
         - obey the max number of EQs setting in sub-functions
      - AMD/Solarflare:
         - support new RSS context API
      - AMD/Pensando:
         - ionic: rework fix for doorbell miss to lower overhead and
           skip it on new HW
      - Wangxun:
         - txgbe: support Flow Director perfect filters

   - Ethernet NICs consumer, embedded and virtual:
      - Add driver for Tehuti Networks TN40xx chips
      - Add driver for Meta's internal NIC chips
      - Add driver for Ethernet MAC on Airoha EN7581 SoCs
      - Add driver for Renesas Ethernet-TSN devices
      - Google cloud vNIC:
         - flow steering support
      - Microsoft vNIC:
         - support page sizes other than 4KB on ARM64
      - vmware vNIC:
         - support latency measurement (update to version 9)
      - VirtIO net:
         - support for Byte Queue Limits
         - support configuring thresholds for automatic IRQ moderation
         - support for AF_XDP Rx zero-copy
      - Synopsys (stmmac):
         - support for STM32MP13 SoC
         - let platforms select the right PCS implementation
      - TI:
         - icssg-prueth: add multicast filtering support
         - icssg-prueth: enable PTP timestamping and PPS
      - Renesas:
         - ravb: improve Rx performance 30-400% by using page pool,
           theaded NAPI and timer-based IRQ coalescing
         - ravb: add MII support for R-Car V4M
      - Cadence (macb):
         - macb: add ARP support to Wake-On-LAN
      - Cortina:
         - use phylib for RX and TX pause configuration

   - Ethernet switches:
      - nVidia/Mellanox:
         - support configuration of multipath hash seed
         - report more accurate max MTU
         - use page_pool to improve Rx performance
      - MediaTek:
         - mt7530: add support for bridge port isolation
      - Qualcomm:
         - qca8k: add support for bridge port isolation
      - Microchip:
         - lan9371/2: add 100BaseTX PHY support
      - NXP:
         - vsc73xx: implement VLAN operations

   - Ethernet PHYs:
      - aquantia: enable support for aqr115c
      - aquantia: add support for PHY LEDs
      - realtek: add support for rtl8224 2.5Gbps PHY
      - xpcs: add memory-mapped device support
      - add BroadR-Reach link mode and support in Broadcom's PHY driver

   - CAN:
      - add document for ISO 15765-2 protocol support
      - mcp251xfd: workaround for erratum DS80000789E, use timestamps to
        catch when device returns incorrect FIFO status

   - WiFi:
      - mac80211/cfg80211:
         - parse Transmit Power Envelope (TPE) data in mac80211 instead
           of in drivers
         - improvements for 6 GHz regulatory flexibility
         - multi-link improvements
         - support multiple radios per wiphy
         - remove DEAUTH_NEED_MGD_TX_PREP flag
      - Intel (iwlwifi):
         - bump FW API to 91 for BZ/SC devices
         - report 64-bit radiotap timestamp
         - enable P2P low latency by default
         - handle Transmit Power Envelope (TPE) advertised by AP
         - remove support for older FW for new devices
         - fast resume (keeping the device configured)
         - mvm: re-enable Multi-Link Operation (MLO)
         - aggregation (A-MSDU) optimizations
      - MediaTek (mt76):
         - mt7925 Multi-Link Operation (MLO) support
      - Qualcomm (ath10k):
         - LED support for various chipsets
      - Qualcomm (ath12k):
         - remove unsupported Tx monitor handling
         - support channel 2 in 6 GHz band
         - support Spatial Multiplexing Power Save (SMPS) in 6 GHz band
         - supprt multiple BSSID (MBSSID) and Enhanced Multi-BSSID
           Advertisements (EMA)
         - support dynamic VLAN
         - add panic handler for resetting the firmware state
         - DebugFS support for datapath statistics
         - WCN7850: support for Wake on WLAN
      - Microchip (wilc1000):
         - read MAC address during probe to make it visible to user space
         - suspend/resume improvements
      - TI (wl18xx):
         - support newer firmware versions
      - RealTek (rtw89):
         - preparation for RTL8852BE-VT support
         - Wake on WLAN support for WiFi 6 chips
         - 36-bit PCI DMA support
      - RealTek (rtlwifi):
         - RTL8192DU support
      - Broadcom (brcmfmac):
         - Management Frame Protection support (to enable WPA3)

   - Bluetooth:
      - qualcomm: use the power sequencer for QCA6390
      - btusb: mediatek: add ISO data transmission functions
      - hci_bcm4377: add BCM4388 support
      - btintel: add support for BlazarU core
      - btintel: add support for Whale Peak2
      - btnxpuart: add support for AW693 A1 chipset
      - btnxpuart: add support for IW615 chipset
      - btusb: add Realtek RTL8852BE support ID 0x13d3:0x3591"

* tag 'net-next-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1589 commits)
  eth: fbnic: Fix spelling mistake "tiggerring" -> "triggering"
  tcp: Replace strncpy() with strscpy()
  wifi: ath12k: fix build vs old compiler
  tcp: Don't access uninit tcp_rsk(req)->ao_keyid in tcp_create_openreq_child().
  eth: fbnic: Write the TCAM tables used for RSS control and Rx to host
  eth: fbnic: Add L2 address programming
  eth: fbnic: Add basic Rx handling
  eth: fbnic: Add basic Tx handling
  eth: fbnic: Add link detection
  eth: fbnic: Add initial messaging to notify FW of our presence
  eth: fbnic: Implement Rx queue alloc/start/stop/free
  eth: fbnic: Implement Tx queue alloc/start/stop/free
  eth: fbnic: Allocate a netdevice and napi vectors with queues
  eth: fbnic: Add FW communication mechanism
  eth: fbnic: Add message parsing for FW messages
  eth: fbnic: Add register init to set PCIe/Ethernet device config
  eth: fbnic: Allocate core device specific structures and devlink interface
  eth: fbnic: Add scaffolding for Meta's NIC driver
  PCI: Add Meta Platforms vendor ID
  net/sched: cls_flower: propagate tca[TCA_OPTIONS] to NL_REQ_ATTR_CHECK
  ...
2024-07-16 19:28:34 -07:00
Linus Torvalds
d80f2996b8 asm-generic updates for 6.11
Most of this is part of my ongoing work to clean up the system call
 tables. In this bit, all of the newer architectures are converted to
 use the machine readable syscall.tbl format instead in place of complex
 macros in include/uapi/asm-generic/unistd.h.
 
 This follows an earlier series that fixed various API mismatches
 and in turn is used as the base for planned simplifications.
 
 The other two patches are dead code removal and a warning fix.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaVB1cACgkQYKtH/8kJ
 UicMqxAAnYKOxfjoMIhYYK6bl126wg/vIcDcjIR9cNWH21Nhn3qxn11ZXau3S7xv
 3l/HreEhyEQr4gC2a70IlXyHUadYOlrk+83OURrunWk1oKPmZlMKcfPVbtp8GL7x
 PUNXQfwM1XZLveKwufY24hoZdwKC+Y/5WLc1t0ReznJuAqgeO2rM9W5dnV5bAfCp
 he3F5hFcr196Dz3/GJjJIWrY+cbwfmZWsNtj1vFTL5/r/LuCu8HTkqhsGj8tE5BJ
 NGVEEXbp5eaVTCIGqJWhnuZcsnKN9kM51M7CtdwWf8OTckUVuJap5OsDVKQkWkGl
 bLPbd2jhDltph0sah51hAIvv4WdkThW76u9FRW7KR3fo7ra67eF7l5j7wc1lE2JB
 GwLJ1X56Bxe1GhvvNTlDmb7DrnlP/DMPuRv3Z6xyH6l8iZ2pMGlnAxuw6Bs1s6Y5
 WSs36ZpnS0ctgjfx37ZITsZSvbKFPpQFJP4siwS8aRNv/NFALNNdFyOCY5lNzspZ
 0dxwjn6/7UpHE4MKh6/hvCg2QwupXXBTRytibw+75/rOsR+EYlmtuONtyq2sLUHe
 ktJ5pg+8XuZm27+wLffuluzmY7sv2F8OU4cTYeM60Ynmc6pRzwUY6/VhG52S1/mU
 Ua4VgYIpzOtlLrYmz5QTWIZpdSFSVbIc/3pLriD6hn4Mvg+BwdA=
 =XOhL
 -----END PGP SIGNATURE-----

Merge tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic

Pull asm-generic updates from Arnd Bergmann:
 "Most of this is part of my ongoing work to clean up the system call
  tables. In this bit, all of the newer architectures are converted to
  use the machine readable syscall.tbl format instead in place of
  complex macros in include/uapi/asm-generic/unistd.h.

  This follows an earlier series that fixed various API mismatches and
  in turn is used as the base for planned simplifications.

  The other two patches are dead code removal and a warning fix"

* tag 'asm-generic-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/arnd/asm-generic:
  vmlinux.lds.h: catch .bss..L* sections into BSS")
  fixmap: Remove unused set_fixmap_offset_io()
  riscv: convert to generic syscall table
  openrisc: convert to generic syscall table
  nios2: convert to generic syscall table
  loongarch: convert to generic syscall table
  hexagon: use new system call table
  csky: convert to generic syscall table
  arm64: rework compat syscall macros
  arm64: generate 64-bit syscall.tbl
  arm64: convert unistd_32.h to syscall.tbl format
  arc: convert to generic syscall table
  clone3: drop __ARCH_WANT_SYS_CLONE3 macro
  kbuild: add syscall table generation to scripts/Makefile.asm-headers
  kbuild: verify asm-generic header list
  loongarch: avoid generating extra header files
  um: don't generate asm/bpf_perf_event.h
  csky: drop asm/gpio.h wrapper
  syscalls: add generic scripts/syscall.tbl
2024-07-16 12:09:03 -07:00
Linus Torvalds
a9a4cd9c33 soc: defconfig updates for 6.11
These are the usual updates to enable newly added drivers, mostly for
 arm64 and riscv this time.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaVOFUACgkQYKtH/8kJ
 Uiecqw/+LleDzQp7Gf2DaQUesmLwfqs6xj69GIDI69y1PLPI0bQK/hSQ3B+ZTpZP
 fF56oxBbevytgf6LzhI/Ma+RLcSWqyNw6wgzkS+pBnslCC/8zc66wU7WF53CL9LL
 Ovxo0MRKzQdCgXpu1OZFeUM38YT72yKNebjiYE502vHHfm3Q7XmGmPgCgIJqF0rE
 BSQaSCPLEJU/pFsus+gAILKn+ptgt8GUWDKRSGksJZ8oyuWB30fl+GZ4yg6TaFie
 /D3gm3Qh9uXviG7w4ks1gTvuOgoTlRAecrzxstPrQvHDaG2WCdmvI68aek7Dj/DH
 AYb6krExF58VBdTK+HhD9hxxF9nLvk58GVNfTd7FryIejv9Nw0sMWC7SGoYkZShF
 bSkuX2cxy6S0+ejvJ+HOkmLFAXY2D3ScuVpIEaoE3V+R017Izq5vyYWdvZv2CyMv
 zv9wZW6oiDMDRyVbnrw2dZ4JQwna3XWf3ffr8Ds6NNJt4e07npPlkmlwsUfZEkMl
 yQAtUBeeGBDT3GKP2ZyaexHbIwXro6EcICa1hupyLZwR3RSiU7ZsNL98lKW9nTkJ
 2ZK5f9vKAat+yFZ+XfZS+khA2qz2gtrykonHYZvXcV7FcM8UmpvlO22gliwRR4VI
 JOwu+biwM6hnGoJc7WyC5YS2rwcOZZ1JIv6vvUDznOZ3pLKFb7E=
 =OqPR
 -----END PGP SIGNATURE-----

Merge tag 'soc-defconfig-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC defconfig updates from Arnd Bergmann:
 "These are the usual updates to enable newly added drivers, mostly for
  arm64 and riscv this time"

* tag 'soc-defconfig-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  arm64: defconfig: Enable the IWLWIFI driver
  ARM: multi_v7_defconfig: Add MCP23S08 pinctrl support
  arm64: defconfig: Enable NVIDIA CoreSight PMU driver
  arm64: defconfig: enable SHM Bridge support for the TZ memory allocator
  arm64: defconfig: Enable secure QFPROM driver
  ARM: imx_v6_v7_defconfig: enable DRM_SII902X and DRM_DISPLAY_CONNECTOR
  ARM: imx_v6_v7_defconfig: Enable drivers for TQMa7x/MBa7x
  riscv: defconfig: Enable StarFive JH7110 drivers
  arm64: defconfig: Enable TI LP873X PMIC
  arm64: defconfig: Enable USB2 PHY Driver
  arm64: defconfig: Enable MTD support for Hyperbus
  ARM: configs: at91: Enable LVDS serializer support
  arm64: defconfig: enable several Qualcomm interconnects
  arm64: defconfig: Enable Marvell 88Q2XXX PHY support
  arm64: defconfig: make CONFIG_INTERCONNECT_QCOM_SM8350 built-in
  arm64: defconfig: enable CONFIG_SM_GPUCC_8350
  arm64: defconfig: Enable Renesas R-Car Gen4 PCIe controller
2024-07-16 11:56:15 -07:00
Linus Torvalds
e3950967f6 soc: dt updates for 6.11
The devicetree updates are fairly well spread out across platforms,
 with Qualcomm making up about a third of the total.
 
 There are three new SoCs in existing product families this:
 
  - NXP i.MX95 is a variant of i.MX93, now with six Cortex-A55 cores
    instead of just two as well as a GPU and more high-speed I/O
    devices.
 
  - Qualcomm QCS8550 is a variant of SM8550 for IOT devices
 
  - Airoha EN7581 is a 10G-PON network chip and related to
    the MT7981 Wireless router chip from its parent Mediatek.
 
 In total there are 58 new machines, including four riscv
 boards and eight for 32-bit arm.
 
 The most exciting new addition is probably a pair of laptops
 based on the Qualcomm x1e80100 (Snapdragon X1 Elite) chip,
 the Asus Vivobook S15 and the Lenovo Yoga Slim7x.
 
 Other noteworthy new additions are:
 
  - A total of 20 Qualcomm based machines, mostly Android devices
    from Samsung, Motorola and LG, as well as a wireless router
    and some reference designs
 
  - Six NXP i.MX based machines, mostly industrial boards along
    with some reference designs
 
  - Mediatek sees some interesting Filogic based routers
    including the "OpenWRT One", a few new Chromebooks as
    well as single-board computers.
 
  - Four machines from Solidrun based on Marvell cn913x,
    replacing the older Armada 8000 based counterparts
 
  - The four Amlogic machines are all set top boxes or reference
    designs for them
 
  - The nine new Rockchips machines are mostly single-board
    computers including some interesting ones based on the
    rk3588 chip like the ROCK 5 ITX board and the CM3588
    with its four NVMe slots
 
  - The RISC-V boards are all single-board computers based on
    Starfive JH7110, Microchip MPFS and Allwinner D1, which all
    had similar boards already
 
 There are also a lot of updates to already supported machines,
 notably for the TI K3, Rockchips, Freescale and of course
 Qualcomm platforms.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaVTSYACgkQYKtH/8kJ
 UidZrQ/9GKrfiZ9xJ/7Vvh/jtF5uObsoVuEC2ZFNXY4q6x6KV8BxuHV6LVHgWVaS
 3+Mp5ER1N+h13cB8aDNQ9lq/TYfINQrAGFPMWK2Ytkg57klqeCblfSiKuQxIfdmG
 SH146R3NPe6lqEZ9yv8KWr1GS8kkkVFgzcOBD2BPwx77elazBvG4Ff5rd3Nizua2
 aAcrO2tKHMOJz4eUOJNvrDppwBZUARwPlScBx+QrJWUIDvjRafGvmwSp80FEQorz
 k258DeBzn3JiHUtvE5MLsaBC1WNghV5WTujEI+SLd5T0XohSr5Y8oisSnn/9fAn4
 CCji0eeeqG/KfIWzEGvs7AKmym1oW1OpdbLRN601YSNxLS7mLE5gEySjFXR3dYje
 IxbYzDV9A8qst/znk+uR6be8YB9r7r+aYi4IlE4lg9xWripTOPNuCx/5tdfa2Ge6
 +fBs4WBz+t0Xba19VjonaP+6HsEPqC2LP0/D44QMktG7QRrYbqILX66Mg/jgPccM
 f167D9WGcWUwoKH2nDZ+m1oXQj0UkSge40gBOFRtGfdCsV77TssmGeq0OeDDSA9K
 bIQgaDVwZuYXr9kyNoYIqziU0JA+mhALLiaAVaMLS8+VcNXRZKscv3fs+yFgCGFy
 aDkqWw6j2M3/O93+t4j4He/KNglquA81DBT8ZZPV1KJ4flTQIk0=
 =xGqj
 -----END PGP SIGNATURE-----

Merge tag 'soc-dt-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC dt updates from Arnd Bergmann:
 "The devicetree updates are fairly well spread out across platforms,
  with Qualcomm making up about a third of the total.

  There are three new SoCs in existing product families this:

   - NXP i.MX95 is a variant of i.MX93, now with six Cortex-A55 cores
     instead of just two as well as a GPU and more high-speed I/O
     devices.

   - Qualcomm QCS8550 is a variant of SM8550 for IOT devices

   - Airoha EN7581 is a 10G-PON network chip and related to the MT7981
     Wireless router chip from its parent Mediatek.

  In total there are 58 new machines, including four riscv boards and
  eight for 32-bit arm.

  The most exciting new addition is probably a pair of laptops based on
  the Qualcomm x1e80100 (Snapdragon X1 Elite) chip, the Asus Vivobook
  S15 and the Lenovo Yoga Slim7x.

  Other noteworthy new additions are:

   - A total of 20 Qualcomm based machines, mostly Android devices from
     Samsung, Motorola and LG, as well as a wireless router and some
     reference designs

   - Six NXP i.MX based machines, mostly industrial boards along with
     some reference designs

   - Mediatek sees some interesting Filogic based routers including the
     "OpenWRT One", a few new Chromebooks as well as single-board
     computers.

   - Four machines from Solidrun based on Marvell cn913x, replacing the
     older Armada 8000 based counterparts

   - The four Amlogic machines are all set top boxes or reference
     designs for them

   - The nine new Rockchips machines are mostly single-board computers
     including some interesting ones based on the rk3588 chip like the
     ROCK 5 ITX board and the CM3588 with its four NVMe slots

   - The RISC-V boards are all single-board computers based on Starfive
     JH7110, Microchip MPFS and Allwinner D1, which all had similar
     boards already

  There are also a lot of updates to already supported machines, notably
  for the TI K3, Rockchips, Freescale and of course Qualcomm platforms"

* tag 'soc-dt-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (846 commits)
  arm64: dts: allwinner: h616: add crypto engine node
  riscv: dts: add clock generator for Sophgo SG2042 SoC
  arm64: dts: rockchip: Add Xunlong Orange Pi 3B
  dt-bindings: arm: rockchip: Add Xunlong Orange Pi 3B
  arm64: dts: rockchip: Add Radxa ROCK 3B
  dt-bindings: arm: rockchip: Add Radxa ROCK 3B
  mailmap: Update Luca Weiss's email address
  ARM: dts: ixp4xx: nslu2: beeper uses PWM
  arm64: dts: rockchip: add ROCK 5 ITX board
  dt-bindings: arm: rockchip: Add ROCK 5 ITX board
  arm64: dts: rockchip: Add dma-names to uart1 on Pine64 rk3566 devices
  arm64: dts: rockchip: Add avdd supplies to hdmi on rock64
  arm64: dts: qcom: msm8916-lg-c50: add initial dts for LG Leon LTE
  arm64: dts: qcom: msm8916-lg-m216: Add initial device tree
  dt-bindings: arm: qcom: Add msm8916 based LG devices
  ARM: dts: qcom: msm8960: correct memory base
  arm64: dts: qcom: ipq9574: Add icc provider ability to gcc
  dt-bindings: interconnect: Add Qualcomm IPQ9574 support
  arm64: dts: qcom: sm8150: Add video clock controller node
  arm64: dts: qcom: pm6150: Add vibrator
  ...
2024-07-16 11:43:51 -07:00
Paolo Bonzini
86014c1e20 KVM generic changes for 6.11
- Enable halt poll shrinking by default, as Intel found it to be a clear win.
 
  - Setup empty IRQ routing when creating a VM to avoid having to synchronize
    SRCU when creating a split IRQCHIP on x86.
 
  - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
    that arch code can use for hooking both sched_in() and sched_out().
 
  - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
    truncating a bogus value from userspace, e.g. to help userspace detect bugs.
 
  - Mark a vCPU as preempted if and only if it's scheduled out while in the
    KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
    memory when retrieving guest state during live migration blackout.
 
  - A few minor cleanups
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEKTobbabEP7vbhhN9OlYIJqCjN/0FAmaRuOYACgkQOlYIJqCj
 N/1UnQ/8CI5Qfr+/0gzYgtWmtEMczGG+rMNpzD3XVqPjJjXcMcBiQnplnzUVLhha
 vlPdYVK7vgmEt003XGzV55mik46LHL+DX/v4hI3HEdblfyCeNLW3fKEWVRB44qJe
 o+YUQwSK42SORUp9oXuQINxhA//U9EnI7CQxlJ8w8wenv5IJKfIGr01DefmfGPAV
 PKm9t6WLcNqvhZMEyy/zmzM3KVPCJL0NcwI97x6sHxFpQYIDtL0E/VexA4AFqMoT
 QK7cSDC/2US41Zvem/r/GzM/ucdF6vb9suzZYBohwhxtVhwJe2CDeYQZvtNKJ1U7
 GOHPaKL6nBWdZCm/yyWbbX2nstY1lHqxhN3JD0X8wqU5rNcwm2b8Vfyav0Ehc7H+
 jVbDTshOx4YJmIgajoKjgM050rdBK59TdfVL+l+AAV5q/TlHocalYtvkEBdGmIDg
 2td9UHSime6sp20vQfczUEz4bgrQsh4l2Fa/qU2jFwLievnBw0AvEaMximkSGMJe
 b8XfjmdTjlOesWAejANKtQolfrq14+1wYw0zZZ8PA+uNVpKdoovmcqSOcaDC9bT8
 GO/NFUvoG+lkcvJcIlo1SSl81SmGLosijwxWfGvFAqsgpR3/3l3dYp0QtztoCNJO
 d3+HnjgYn5o5FwufuTD3eUOXH4AFjG108DH0o25XrIkb2Kymy0o=
 =BalU
 -----END PGP SIGNATURE-----

Merge tag 'kvm-x86-generic-6.11' of https://github.com/kvm-x86/linux into HEAD

KVM generic changes for 6.11

 - Enable halt poll shrinking by default, as Intel found it to be a clear win.

 - Setup empty IRQ routing when creating a VM to avoid having to synchronize
   SRCU when creating a split IRQCHIP on x86.

 - Rework the sched_in/out() paths to replace kvm_arch_sched_in() with a flag
   that arch code can use for hooking both sched_in() and sched_out().

 - Take the vCPU @id as an "unsigned long" instead of "u32" to avoid
   truncating a bogus value from userspace, e.g. to help userspace detect bugs.

 - Mark a vCPU as preempted if and only if it's scheduled out while in the
   KVM_RUN loop, e.g. to avoid marking it preempted and thus writing guest
   memory when retrieving guest state during live migration blackout.

 - A few minor cleanups
2024-07-16 09:51:36 -04:00
Masahiro Yamada
b9d73218d7 treewide: change conditional prompt for choices to 'depends on'
While Documentation/kbuild/kconfig-language.rst provides a brief
explanation, there are recurring confusions regarding the usage of a
prompt followed by 'if <expr>'. This conditional controls _only_ the
prompt.

A typical usage is as follows:

    menuconfig BLOCK
            bool "Enable the block layer" if EXPERT
            default y

When EXPERT=n, the prompt is hidden, but this config entry is still
active, and BLOCK is set to its default value 'y'. This is reasonable
because you are likely want to enable the block device support. When
EXPERT=y, the prompt is shown, allowing you to toggle BLOCK.

Please note that it is different from 'depends on EXPERT', which would
enable and disable the entire config entry.

However, this conditional prompt has never worked in a choice block.

The following two work in the same way: when EXPERT is disabled, the
choice block is entirely disabled.

[Test Code 1]

    choice
            prompt "choose" if EXPERT

    config A
            bool "A"

    config B
            bool "B"

    endchoice

[Test Code 2]

    choice
            prompt "choose"
            depends on EXPERT

    config A
            bool "A"

    config B
            bool "B"

    endchoice

I believe the first case should hide only the prompt, producing the
default:

   CONFIG_A=y
   # CONFIG_B is not set

The next commit will change (fix) the behavior of the conditional prompt
in choice blocks.

I see several choice blocks wrongly using a conditional prompt, where
'depends on' makes more sense.

To preserve the current behavior, this commit converts such misuses.

I did not touch the following entry in arch/x86/Kconfig:

    choice
            prompt "Memory split" if EXPERT
            default VMSPLIT_3G

This is truly the correct use of the conditional prompt; when EXPERT=n,
this choice block should silently select the reasonable VMSPLIT_3G,
although the resulting PAGE_OFFSET will not be affected anyway.

Presumably, the one in fs/jffs2/Kconfig is also correct, but I converted
it to 'depends on' to avoid any potential behavioral change.

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
2024-07-16 01:08:37 +09:00
Qingfang Deng
93b63f68d0
riscv: lib: relax assembly constraints in hweight
rd and rs don't have to be the same. In some cases where rs needs to be
saved for later usage, this will save us some mv instructions.

Signed-off-by: Qingfang Deng <qingfang.deng@siflower.com.cn>
Reviewed-by: Xiao Wang <xiao.w.wang@intel.com>
Link: https://lore.kernel.org/r/20240527092405.134967-1-dqfext@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-15 08:46:46 -07:00
Christophe Leroy
e6c0c03245 mm: provide mm_struct and address to huge_ptep_get()
On powerpc 8xx huge_ptep_get() will need to know whether the given ptep is
a PTE entry or a PMD entry.  This cannot be known with the PMD entry
itself because there is no easy way to know it from the content of the
entry.

So huge_ptep_get() will need to know either the size of the page or get
the pmd.

In order to be consistent with huge_ptep_get_and_clear(), give mm and
address to huge_ptep_get().

Link: https://lkml.kernel.org/r/cc00c70dd384298796a4e1b25d6c4eb306d3af85.1719928057.git.christophe.leroy@csgroup.eu
Signed-off-by: Christophe Leroy <christophe.leroy@csgroup.eu>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Cc: Jason Gunthorpe <jgg@nvidia.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-12 15:52:15 -07:00
yang.zhang
6ad8735994
riscv: set trap vector earlier
The exception vector of the booting hart is not set before enabling
the mmu and then still points to the value of the previous firmware,
typically _start. That makes it hard to debug setup_vm() when bad
things happen. So fix that by setting the exception vector earlier.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: yang.zhang <yang.zhang@hexintek.com>
Link: https://lore.kernel.org/r/20240508022445.6131-1-gaoshanliukou@163.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 08:55:31 -07:00
Palmer Dabbelt
5ee121a393
Merge patch series "riscv: Apply Zawrs when available"
Andrew Jones <ajones@ventanamicro.com> says:

Zawrs provides two instructions (wrs.nto and wrs.sto), where both are
meant to allow the hart to enter a low-power state while waiting on a
store to a memory location. The instructions also both wait an
implementation-defined "short" duration (unless the implementation
terminates the stall for another reason). The difference is that while
wrs.sto will terminate when the duration elapses, wrs.nto, depending on
configuration, will either just keep waiting or an ILL exception will be
raised. Linux will use wrs.nto, so if platforms have an implementation
which falls in the "just keep waiting" category (which is not expected),
then it should _not_ advertise Zawrs in the hardware description.

Like wfi (and with the same {m,h}status bits to configure it), when
wrs.nto is configured to raise exceptions it's expected that the higher
privilege level will see the instruction was a wait instruction, do
something, and then resume execution following the instruction. For
example, KVM does configure exceptions for wfi (hstatus.VTW=1) and
therefore also for wrs.nto. KVM does this for wfi since it's better to
allow other tasks to be scheduled while a VCPU waits for an interrupt.
For waits such as those where wrs.nto/sto would be used, which are
typically locks, it is also a good idea for KVM to be involved, as it
can attempt to schedule the lock holding VCPU.

This series starts with Christoph's addition of the riscv
smp_cond_load_relaxed function which applies wrs.sto when available.
That patch has been reworked to use wrs.nto and to use the same approach
as Arm for the wait loop, since we can't have arbitrary C code between
the load-reserved and the wrs. Then, hwprobe support is added (since the
instructions are also usable from usermode), and finally KVM is
taught about wrs.nto, allowing guests to see and use the Zawrs
extension.

We still don't have test results from hardware, and it's not possible to
prove that using Zawrs is a win when testing on QEMU, not even when
oversubscribing VCPUs to guests. However, it is possible to use KVM
selftests to force a scenario where we can prove Zawrs does its job and
does it well. [4] is a test which does this and, on my machine, without
Zawrs it takes 16 seconds to complete and with Zawrs it takes 0.25
seconds.

This series is also available here [1]. In order to use QEMU for testing
a build with [2] is needed. In order to enable guests to use Zawrs with
KVM using kvmtool, the branch at [3] may be used.

[1] https://github.com/jones-drew/linux/commits/riscv/zawrs-v3/
[2] https://lore.kernel.org/all/20240312152901.512001-2-ajones@ventanamicro.com/
[3] https://github.com/jones-drew/kvmtool/commits/riscv/zawrs/
[4] cb2beccebc

Link: https://lore.kernel.org/r/20240426100820.14762-8-ajones@ventanamicro.com

* b4-shazam-merge:
  KVM: riscv: selftests: Add Zawrs extension to get-reg-list test
  KVM: riscv: Support guest wrs.nto
  riscv: hwprobe: export Zawrs ISA extension
  riscv: Add Zawrs support for spinlocks
  dt-bindings: riscv: Add Zawrs ISA extension description
  riscv: Provide a definition for 'pause'

Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 08:55:29 -07:00
Andrew Jones
86d6a86e59
KVM: riscv: Support guest wrs.nto
When a guest traps on wrs.nto, call kvm_vcpu_on_spin() to attempt
to yield to the lock holding VCPU. Also extend the KVM ISA extension
ONE_REG interface to allow KVM userspace to detect and enable the
Zawrs extension for the Guest/VM.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240426100820.14762-13-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 08:54:51 -07:00
Andrew Jones
244c18fbf6
riscv: hwprobe: export Zawrs ISA extension
Export Zawrs ISA extension through hwprobe.

[Palmer: there's a gap in the numbers here as there will be a merge
conflict when this is picked up.  To avoid confusion I just set the
hwprobe ID to match what it would be post-merge.]

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240426100820.14762-12-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 08:53:50 -07:00
Paolo Bonzini
c8b8b8190a LoongArch KVM changes for v6.11
1. Add ParaVirt steal time support.
 2. Add some VM migration enhancement.
 3. Add perf kvm-stat support for loongarch.
 -----BEGIN PGP SIGNATURE-----
 
 iQJKBAABCAA0FiEEzOlt8mkP+tbeiYy5AoYrw/LiJnoFAmaOS6UWHGNoZW5odWFj
 YWlAa2VybmVsLm9yZwAKCRAChivD8uImehejD/9pACGe3h3krXLcFVWXOFIu5Hpc
 5kQLP0lSPJ/o5Xs8t/oPLrnDX70z90wXI1LOmltc7h32MSwFa2l8COQh+sN5eJBQ
 PNyt7u7bMipp0yJS4Gl3LQQ5vklcGOSpQc/gbeXnVx8J/tz+Mo9YGGLIXVRXRM6W
 Ri8D2VVFiwzQQYeTpPo1u1Ob8C6mA4KOppwvhscMTM3vj4NMbsinBzRnR0lG0Tdw
 meFhxDPly1Ksxsbnj9UGO6UnEY0A2SLONs6MiO4y4DtoqoDlw/lbqFJuYo4vvbx1
 pxtjyirD/PX/wjslQFWUOuU0hMfAodera+JupZ5BZWfcG8FltA4DQfDsm/U9RjK/
 7gGNnr8Xk2/tp6+4AVV+HU2iTgRvq+mXCL72zSy2Y4r7ElBAANDfk4n+Zn/PWisn
 U9wwV8Ue7tVB15BRpRsg77NzBidiCFEe/6flWYiX2y24ke71gwDJBGUy8hMdKt6t
 4Cq8atsU0MvDAzfYMsK9JjskJp4UFq6wb1tXbbuADM4TDhnzlK6s6h3vM+pFlh/f
 my7fDH8/2qsCWhBDM4pmsJskVp+I1GOk/80RjTQISwx7iHktJWvxNYTaisK2fvD5
 Qs1IUWfNFbDX0Lr0QpN6j6X4rZkghR4R6XoFkd4nkicwi+UHVn3oK9GSqv24QJn9
 7+Ev3dfRTUYLd6mC4Q==
 =DpIK
 -----END PGP SIGNATURE-----

Merge tag 'loongarch-kvm-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/chenhuacai/linux-loongson into HEAD

LoongArch KVM changes for v6.11

1. Add ParaVirt steal time support.
2. Add some VM migration enhancement.
3. Add perf kvm-stat support for loongarch.
2024-07-12 11:24:12 -04:00
Paolo Bonzini
60d2b2f3c4 KVM/riscv changes for 6.11
- Redirect AMO load/store access fault traps to guest
 - Perf kvm stat support for RISC-V
 - Use HW IMSIC guest files when available
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmaRDW4ACgkQrUjsVaLH
 LAcU2Q/+IaL17M8D8ueOcbmCMqZRReyVdR9vH6q87E9NJYRH2dewZ656bNQnnU20
 3hkbHOnF+NJAHJ0SfXwqNTVkJcQ8u+F3Xui4DlnFZ/lkpcWpvT/DRI5SCjIjiB/G
 SS/xWaRoSjvVJ7M8SyQhHUb2Y/tiDRXOOEl59ROGAKjzC3SY5/NJJ6g5FeE5akT8
 /Q7WisZmc+ZH+a9EEOnl+Do7AFakrlaFM5KnweamfqSlSFrQB12YNpSsmA16k6X9
 fqK/xPQTjeNakdQDPKw8INCbXkt8dsnlrPS6ivL0FCVf38aIJK0jxyLk9JbZGBK8
 +dGCJOLVJontEyOVTYheq2oWv40xAlkXDjLNbnz+Nf7Sau8evFBpE2mPnbUBoGZi
 fu5UCddSw3CFwrFNM+qiBRPz/mNuUpCC4pCh8yJSCDZ374ew9ili2l3Nb2IvBcJ2
 36lQuxlPVTPOv1J76/WtYwsSwaYBHHcBshweTJCkAkezp0d/wAE8bpaw3n4YnfSn
 l4u8/rrnEBb3Cd9cbW1Vk77Vw5e02RlZY5T+JLj7TXWSAFzstYxMpLsf097tqqcn
 vY1iTrpxTcJuY0Rra3SI05eKgliXI5snh08xlW2NiVxu8NjjZMU73b6tg3JX8FHl
 DMCafyQUBueV2jCpwbYribpbWv/UuUl92AKyJOwZ76W/e9YVBLA=
 =atJQ
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-6.11-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv changes for 6.11

- Redirect AMO load/store access fault traps to guest
- Perf kvm stat support for RISC-V
- Use guest files for IMSIC virtualization, when available

ONE_REG support for the Zimop, Zcmop, Zca, Zcf, Zcd, Zcb and Zawrs ISA
extensions is coming through the RISC-V tree.
2024-07-12 11:19:51 -04:00
Christoph Müllner
b8ddb0df30
riscv: Add Zawrs support for spinlocks
RISC-V code uses the generic ticket lock implementation, which calls
the macros smp_cond_load_relaxed() and smp_cond_load_acquire().
Introduce a RISC-V specific implementation of smp_cond_load_relaxed()
which applies WRS.NTO of the Zawrs extension in order to reduce power
consumption while waiting and allows hypervisors to enable guests to
trap while waiting. smp_cond_load_acquire() doesn't need a RISC-V
specific implementation as the generic implementation is based on
smp_cond_load_relaxed() and smp_acquire__after_ctrl_dep() sufficiently
provides the acquire semantics.

This implementation is heavily based on Arm's approach which is the
approach Andrea Parri also suggested.

The Zawrs specification can be found here:
https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc

Signed-off-by: Christoph Müllner <christoph.muellner@vrull.eu>
Co-developed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240426100820.14762-11-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 03:16:42 -07:00
Andrew Jones
6da111574b
riscv: Provide a definition for 'pause'
If we're going to provide the encoding for 'pause' in cpu_relax()
anyway, then we can drop the toolchain checks and just always use
it. The advantage of doing this is that other code that need
pause don't need to also define it (yes, another use is coming).
Add the definition to insn-def.h since it's an instruction
definition and also because insn-def.h doesn't include much, so
it's safe to include from asm/vdso/processor.h without concern for
circular dependencies.

Signed-off-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240426100820.14762-9-ajones@ventanamicro.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-12 03:16:39 -07:00
Jakub Kicinski
7c8267275d Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

net/sched/act_ct.c
  26488172b0 ("net/sched: Fix UAF when resolving a clash")
  3abbd7ed8b ("act_ct: prepare for stolen verdict coming from conntrack and nat engine")

No adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-11 12:58:13 -07:00
Clément Léger
c9b8cd139c
riscv: hwprobe: export highest virtual userspace address
Some userspace applications (OpenJDK for instance) uses the free MSBs
in pointers to insert additional information for their own logic and
need to get this information from somewhere. Currently they rely on
parsing /proc/cpuinfo "mmu=svxx" string to obtain the current value of
virtual address usable bits [1]. Since this reflect the raw supported
MMU mode, it might differ from the logical one used internally which is
why arch_get_mmap_end() is used. Exporting the highest mmapable address
through hwprobe will allow a more stable interface to be used. For that
purpose, add a new hwprobe key named
RISCV_HWPROBE_KEY_HIGHEST_VIRT_ADDRESS which will export the highest
userspace virtual address.

Link: https://github.com/openjdk/jdk/blob/master/src/hotspot/os_cpu/linux_riscv/vm_version_linux_riscv.cpp#L171 [1]
Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240410144558.1104006-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-11 08:57:33 -07:00
Thorsten Blum
fb9086e95a riscv: Remove unnecessary int cast in variable_fls()
__builtin_clz() returns an int and casting the whole expression to int
is unnecessary. Remove it.

Signed-off-by: Thorsten Blum <thorsten.blum@toblux.com>
Signed-off-by: Yury Norov <yury.norov@gmail.com>
2024-07-10 14:30:35 -07:00
Alexandre Ghiti
16badacd8a
riscv: Improve sbi_ecall() code generation by reordering arguments
The sbi_ecall() function arguments are not in the same order as the
ecall arguments, so we end up re-ordering the registers before the
ecall which is useless and costly.

So simply reorder the arguments in the same way as expected by ecall.
Instead of reordering directly the arguments of sbi_ecall(), use a proxy
macro since the current ordering is more natural.

Before:

Dump of assembler code for function sbi_ecall:
   0xffffffff800085e0 <+0>: add sp,sp,-32
   0xffffffff800085e2 <+2>: sd s0,24(sp)
   0xffffffff800085e4 <+4>: mv t1,a0
   0xffffffff800085e6 <+6>: add s0,sp,32
   0xffffffff800085e8 <+8>: mv t3,a1
   0xffffffff800085ea <+10>: mv a0,a2
   0xffffffff800085ec <+12>: mv a1,a3
   0xffffffff800085ee <+14>: mv a2,a4
   0xffffffff800085f0 <+16>: mv a3,a5
   0xffffffff800085f2 <+18>: mv a4,a6
   0xffffffff800085f4 <+20>: mv a5,a7
   0xffffffff800085f6 <+22>: mv a6,t3
   0xffffffff800085f8 <+24>: mv a7,t1
   0xffffffff800085fa <+26>: ecall
   0xffffffff800085fe <+30>: ld s0,24(sp)
   0xffffffff80008600 <+32>: add sp,sp,32
   0xffffffff80008602 <+34>: ret

After:

Dump of assembler code for function __sbi_ecall:
   0xffffffff8000b6b2 <+0>:	add	sp,sp,-32
   0xffffffff8000b6b4 <+2>:	sd	s0,24(sp)
   0xffffffff8000b6b6 <+4>:	add	s0,sp,32
   0xffffffff8000b6b8 <+6>:	ecall
   0xffffffff8000b6bc <+10>:	ld	s0,24(sp)
   0xffffffff8000b6be <+12>:	add	sp,sp,32
   0xffffffff8000b6c0 <+14>:	ret

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240322112629.68170-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-10 13:30:56 -07:00
Samuel Holland
56c1c1a09a
riscv: Add tracepoints for SBI calls and returns
These are useful for measuring the latency of SBI calls. The SBI HSM
extension is excluded because those functions are called from contexts
such as cpuidle where instrumentation is not allowed.

Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240321230131.1838105-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-10 13:23:09 -07:00
Xiao Wang
a43fe27d65
riscv: Optimize crc32 with Zbc extension
As suggested by the B-ext spec, the Zbc (carry-less multiplication)
instructions can be used to accelerate CRC calculations. Currently, the
crc32 is the most widely used crc function inside kernel, so this patch
focuses on the optimization of just the crc32 APIs.

Compared with the current table-lookup based optimization, Zbc based
optimization can also achieve large stride during CRC calculation loop,
meantime, it avoids the memory access latency of the table-lookup based
implementation and it reduces memory footprint.

If Zbc feature is not supported in a runtime environment, then the
table-lookup based implementation would serve as fallback via alternative
mechanism.

By inspecting the vmlinux built by gcc v12.2.0 with default optimization
level (-O2), we can see below instruction count change for each 8-byte
stride in the CRC32 loop:

rv64: crc32_be (54->31), crc32_le (54->13), __crc32c_le (54->13)
rv32: crc32_be (50->32), crc32_le (50->16), __crc32c_le (50->16)

The compile target CPU is little endian, extra effort is needed for byte
swapping for the crc32_be API, thus, the instruction count change is not
as significant as that in the *_le cases.

This patch is tested on QEMU VM with the kernel CRC32 selftest for both
rv64 and rv32. Running the CRC32 selftest on a real hardware (SpacemiT K1)
with Zbc extension shows 65% and 125% performance improvement respectively
on crc32_test() and crc32c_test().

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240621054707.1847548-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-10 13:19:50 -07:00
Arnd Bergmann
3db80c999d riscv: convert to generic syscall table
The uapi/asm/unistd_{32,64}.h and asm/syscall_table_{32,64}.h headers can
now be generated from scripts/syscall.tbl, which makes this consistent
with the other architectures that have their own syscall.tbl.

riscv has two extra system call that gets added to scripts/syscall.tbl.

The newstat and rlimit entries in the syscall_abis_64 line are for system
calls that were part of the generic ABI when riscv64 got added but are
no longer enabled by default for new architectures. Both riscv32 and
riscv64 also implement memfd_secret, which is optional for all
architectures.

Unlike all the other 32-bit architectures, the time32 and stat64
sets of syscalls are not enabled on riscv32.

Both the user visible side of asm/unistd.h and the internal syscall
table in the kernel should have the same effective contents after this.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-10 14:23:38 +02:00
Arnd Bergmann
505d66d1ab clone3: drop __ARCH_WANT_SYS_CLONE3 macro
When clone3() was introduced, it was not obvious how each architecture
deals with setting up the stack and keeping the register contents in
a fork()-like system call, so this was left for the architecture
maintainers to implement, with __ARCH_WANT_SYS_CLONE3 defined by those
that already implement it.

Five years later, we still have a few architectures left that are missing
clone3(), and the macro keeps getting in the way as it's fundamentally
different from all the other __ARCH_WANT_SYS_* macros that are meant
to provide backwards-compatibility with applications using older
syscalls that are no longer provided by default.

Address this by reversing the polarity of the macro, adding an
__ARCH_BROKEN_SYS_CLONE3 macro to all architectures that don't
already provide the syscall, and remove __ARCH_WANT_SYS_CLONE3
from all the other ones.

Acked-by: Geert Uytterhoeven <geert@linux-m68k.org>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-10 14:23:38 +02:00
Paolo Abeni
7b769adc26 bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZoxN0AAKCRDbK58LschI
 g0c5AQDa3ZV9gfbN42y1zSDoM1uOgO60fb+ydxyOYh8l3+OiQQD/fLfpTY3gBFSY
 9yi/pZhw/QdNzQskHNIBrHFGtJbMxgs=
 =p1Zz
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-07-08

The following pull-request contains BPF updates for your *net-next* tree.

We've added 102 non-merge commits during the last 28 day(s) which contain
a total of 127 files changed, 4606 insertions(+), 980 deletions(-).

The main changes are:

1) Support resilient split BTF which cuts down on duplication and makes BTF
   as compact as possible wrt BTF from modules, from Alan Maguire & Eduard Zingerman.

2) Add support for dumping kfunc prototypes from BTF which enables both detecting
   as well as dumping compilable prototypes for kfuncs, from Daniel Xu.

3) Batch of s390x BPF JIT improvements to add support for BPF arena and to implement
   support for BPF exceptions, from Ilya Leoshkevich.

4) Batch of riscv64 BPF JIT improvements in particular to add 12-argument support
   for BPF trampolines and to utilize bpf_prog_pack for the latter, from Pu Lehui.

5) Extend BPF test infrastructure to add a CHECKSUM_COMPLETE validation option
   for skbs and add coverage along with it, from Vadim Fedorenko.

6) Inline bpf_get_current_task/_btf() helpers in the arm64 BPF JIT which gives
   a small 1% performance improvement in micro-benchmarks, from Puranjay Mohan.

7) Extend the BPF verifier to track the delta between linked registers in order
   to better deal with recent LLVM code optimizations, from Alexei Starovoitov.

8) Fix bpf_wq_set_callback_impl() kfunc signature where the third argument should
   have been a pointer to the map value, from Benjamin Tissoires.

9) Extend BPF selftests to add regular expression support for test output matching
   and adjust some of the selftest when compiled under gcc, from Cupertino Miranda.

10) Simplify task_file_seq_get_next() and remove an unnecessary loop which always
    iterates exactly once anyway, from Dan Carpenter.

11) Add the capability to offload the netfilter flowtable in XDP layer through
    kfuncs, from Florian Westphal & Lorenzo Bianconi.

12) Various cleanups in networking helpers in BPF selftests to shave off a few
    lines of open-coded functions on client/server handling, from Geliang Tang.

13) Properly propagate prog->aux->tail_call_reachable out of BPF verifier, so
    that x86 JIT does not need to implement detection, from Leon Hwang.

14) Fix BPF verifier to add a missing check_func_arg_reg_off() to prevent an
    out-of-bounds memory access for dynpointers, from Matt Bobrowski.

15) Fix bpf_session_cookie() kfunc to return __u64 instead of long pointer as
    it might lead to problems on 32-bit archs, from Jiri Olsa.

16) Enhance traffic validation and dynamic batch size support in xsk selftests,
    from Tushar Vyavahare.

bpf-next-for-netdev

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (102 commits)
  selftests/bpf: DENYLIST.aarch64: Remove fexit_sleep
  selftests/bpf: amend for wrong bpf_wq_set_callback_impl signature
  bpf: helpers: fix bpf_wq_set_callback_impl signature
  libbpf: Add NULL checks to bpf_object__{prev_map,next_map}
  selftests/bpf: Remove exceptions tests from DENYLIST.s390x
  s390/bpf: Implement exceptions
  s390/bpf: Change seen_reg to a mask
  bpf: Remove unnecessary loop in task_file_seq_get_next()
  riscv, bpf: Optimize stack usage of trampoline
  bpf, devmap: Add .map_alloc_check
  selftests/bpf: Remove arena tests from DENYLIST.s390x
  selftests/bpf: Add UAF tests for arena atomics
  selftests/bpf: Introduce __arena_global
  s390/bpf: Support arena atomics
  s390/bpf: Enable arena
  s390/bpf: Support address space cast instruction
  s390/bpf: Support BPF_PROBE_MEM32
  s390/bpf: Land on the next JITed instruction after exception
  s390/bpf: Introduce pre- and post- probe functions
  s390/bpf: Get rid of get_probe_mem_regno()
  ...
====================

Link: https://patch.msgid.link/20240708221438.10974-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-07-09 17:01:46 +02:00
Arnd Bergmann
95ab7b209b RISC-V Devicetrees for v6.11
Sopgho:
 Add clock support for SG2042.
 
 Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQGzBAABCgAdFiEEdoBX2jyDC9ZCTwZjDCzASqG0i0IFAmaMg3wACgkQDCzASqG0
 i0Jh1wv/SVACZmbHdQ0ldCrjdPeSDQ5cN+TP2NHihFfE8+H7rzP/baHv9EnezGVQ
 Jcjw4Qj8uYqggN0wpWbWvuHC8tNPSrYIoQgbtnjyLBEjtwFINSOx7vwvldBszN1g
 ZXQuAf5KRyuBAcqUTVUBqX9bsZ9BvNvrV3xcLL9PZzH05tQiQG4J5ix9DF6wfkNW
 7O0Ibz88EnTAUOhqLWiR+tOB6Us/fN2lZPgyVHjqy3L3OqwsDOtGjMlfDuCW2SJk
 Yw9Yi1tjSCr3Vipvz5RgJfuCElgOTgJCcyW6EkR9aBmVflpBQpFFYsw88EFA7qeO
 sKWKdln2+DuX9xqneDmpVLr7uEBP3QDdI+lr+gjC07k3lVO778meOqYa+NHIFhRL
 zT+VZ9NPD311/mNXt67yUoBntA7SgeaE6hPpcYeXEpl05T4nN9rdyFFNWm/IF3qH
 7gd07Z78e8Cl02fz4V2rMdK61+2vfDbCITrCmbCChaWAdiPgI2Glo6ejGns+xzcw
 dPzLpga/
 =NA9d
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaM+eMACgkQYKtH/8kJ
 UicYiw//ZH53sAC+TS2NiaW6GeqcXf5FOyt5uJpjG/Sga+OJvWqlL4rd9068aTGv
 Jvx0ZbFm0pml3OaVH7k23xyyGQN74v12Wnj9mZvKaTXTv2MKJfUMN+v3HkbUcju8
 OBGrQcYdxWCg7ZYSDVYdm+ni6HwxiEripjRaVLbBIiAog+HkDtFs0ynpMLDv38MN
 0rdFT5v7Zmfcz226De4zmhA2HK/1eP+vhsl3A/BqozUOk4z6J2eJQ8dUIwri8Cn3
 YAttT0DPasdi3C8DBW0YZpwRIPYcpmPpjgtUyuc6LnqwWnsw+2HPCZsvpT0P336F
 67ZHjYcKPQSPwqCicIgY94s2oqCWuDftD5kUfiQR7y6qfmQB/g0G3PCStH+DHBCC
 SXWOlRjIiACCycMVa1b7CKxCbo524/TjKd6Buf9RpxxrYV6IFaCMx6KH+rQMMa+f
 UV8yXJ7bP0OttvX4jpMelZpdH52Zk6SciJ2MHI7Ca2PHWBKeIPsWmMQeMqEHJ4hi
 ecEZIHMJ3iuhZ1aCv11LMtvFIDUVRyzgPybp241MXiCeSE0pg8QfiQazfnqidwsw
 a7hYb7ic+sCRhNx0QoFnGWQf4Y8PCf/8/oVHXyHFUGtTDM25Z4a1Prp5ax9JW0xx
 FTt7TRIxCTVTrCHWliEk2SiUbje9PWl3lCIPITOt7B9ZW5jYazA=
 =kAb0
 -----END PGP SIGNATURE-----

Merge tag 'riscv-sophgo-dt-for-v6.11' of https://github.com/sophgo/linux into soc/dt

RISC-V Devicetrees for v6.11

Sopgho:
Add clock support for SG2042.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>

* tag 'riscv-sophgo-dt-for-v6.11' of https://github.com/sophgo/linux:
  riscv: dts: add clock generator for Sophgo SG2042 SoC

Link: https://lore.kernel.org/r/PN1P287MB281861EA2B1706B430D2FA3EFEDB2@PN1P287MB2818.INDP287.PROD.OUTLOOK.COM
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-09 10:50:42 +02:00
Chen Wang
b1240a3951 riscv: dts: add clock generator for Sophgo SG2042 SoC
Add clock generator node to device tree for SG2042, and enable clock for
uart.

Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
Reviewed-by: Guo Ren <guoren@kernel.org>
2024-07-09 08:19:52 +08:00
Arnd Bergmann
43528789a0 RISC-V config update for v6.11
StarFive:
 Enable most of the options needed for the jh7100 based boards to be
 properly testable with defconfig.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZorLpAAKCRB4tDGHoIJi
 0jCRAPwP4dWXvXRr3889fZKWhwG8QgPp4WOMWUN6qEXXrc5p1gD+L6VOKPblFH6e
 4cVv8BwYHpqhhVt0P+wU6lvtq2xnLAc=
 =PAkK
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaMPwMACgkQYKtH/8kJ
 UieEvQ//XP74fK0j/RCHTJlTzpT7J6rcl5dumQ6Y4Abt+DQTFcwC5evMmFwLcZFc
 WBVsDiuJi7NXJhIJaXUEbAjQhgAMR+lNm985Gi1dKMEuYGtUvpoNH2D4h0/RBJJH
 rnJ3FNZBwkfsXf7pF3BY8jlBQk9Qu9yGYfiD3F2sjuClLYvU0AIjv0cO7yph/pUR
 NLrgfPr748A5SnwbQDN9zO8S6uv6hjpL4qG0q/ao9UjiD76yTK61Xet27KRr9coE
 B3ylawAw484D72/iJBtoYgtqZlUDCTJQxsXIE9Fxz3kfGzCIhEV4dG0LSqqNkn8I
 nq3hUkoONt1HYkgOP5HkV4Nl51Z0Haf7Nu4h0ikORg6VFGGKeou+6MYrc/tRNgTl
 bqa0hKAyeQhr/xZuPgKFve8prIeJNoBaxLEzC1pvk++pgfyLaCMvu2/8uH0C1mhW
 dH7TxvmF/P7nsaGANOIWD7cz7d4Gear0qiFYXsNHOBRoHR2wHpqipiVTIv4cgw4A
 u4NWC9e9OKoB1HYqaUYreHkMYuocCLvwAj+Bi22kH5YgofbQx/zn4GCQaQn6XX6f
 /s9XV+Tn8kt/wz39DJr9MVTFoJ0YdUlJwCYmA+F2qz+DXDOmExj14U/sP7nnKqgR
 jEolQVmRv9PjcTjhu48P1NBQb5SsWINUZH9cz8+rUVmrdBO6vsU=
 =ib+4
 -----END PGP SIGNATURE-----

Merge tag 'riscv-config-for-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/defconfig

RISC-V config update for v6.11

StarFive:
Enable most of the options needed for the jh7100 based boards to be
properly testable with defconfig.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-config-for-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: defconfig: Enable StarFive JH7110 drivers

Link: https://lore.kernel.org/r/20240707-unused-outflank-aa127ccb2cfe@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-08 21:33:23 +02:00
Arnd Bergmann
31f6b5a651 RISC-V Devicetrees for v6.11
T-Head:
 Last change from me before this starts going via Drew's tree is the
 addition of the SBI PMU events node for the th1520.
 
 StarFive:
 A dts for the Pin64 Star64, another board with a jh7110 SoC. This board
 is almost identical to the existing Milk-v Mars and VisionFive 2 boards
 that are already support - just with a different PHY configuration and
 only one of the two PCIe ports exposed. Additionally, the Mars and
 VisionFive 2 get their PCie configuration added.
 
 Microchip:
 A dts for the BeagleV Fire. PCIe is disabled on it for now, as some
 binding and driver changes are required.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZorVOAAKCRB4tDGHoIJi
 0j35AQDgZyo3BkwRVpEqGo4GXNWvOPPfiEAuQ+m8hlnDG8PteQD+LOkcfjK/Y2Bk
 vj6vhJYGy/9yODRXaI756szEf6VGDg8=
 =Tefh
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaL/p0ACgkQYKtH/8kJ
 UicuDQ/9EQiTkV7dluXIl1/imUBIR1cxqH7nLDjy4A7aC5woNBIPKJA0yDI9DAGn
 1hDcQG9zqLNmtpPtfXnlIFMmHoA2DF3UNH+34/RjHkPoweg2J+t+k0SdYL+DRbJi
 cdLNChYScauZAzOllMLR82Z4k8bBNYAUheF/yMnRL6m928slx1bxYMp7u080VhVA
 eN3RbLlsvyeifl/wyhH+U7WgQvvJ5WtVfq3bEGjUo26izGTIqh6FBiguBcj0yGvH
 5tOT6NKtXfuZQPivo6hDGaedgz6FDjfGBGSk48Cg6arcktHNAeyUNwjIMv19w+si
 FW8LZiNXFB8l3nVO/CoqjEcwO5Z2dyeJvp7CQSOcJYm/3XoQSnH38nm0Qmf6+NWY
 pJ3wG0CIbPgSAt2bbumtYW/WRRtrBUEU0de5aZzXGH/iK3aF0bdkQAKj9DOmnc1h
 3dhYkxSXPMbtuS6Zf9zvlchiRyN+CnVhAzqQlf4ibQ6NgfHS9c/ZyKsbRbHGRJGm
 Q5XH/9XgSEp4hCr8/U0e3ZPnuBe1IBfAW2BTVLoKKmWpq/VLqiHvxxrlF9B8ufoU
 /j0tEGqNSC+aUCFvZtn9FMb+KPaQCqj/qRRuFaLlA2zfrv7PTOl9w6sEa6x3+PoL
 u/AIvwsZW1ptuN5Ou72gybxOL/zIEd9urchfAZfDQ+6DoXENgTw=
 =tdm/
 -----END PGP SIGNATURE-----

Merge tag 'riscv-dt-for-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into soc/dt

RISC-V Devicetrees for v6.11

T-Head:
Last change from me before this starts going via Drew's tree is the
addition of the SBI PMU events node for the th1520.

StarFive:
A dts for the Pin64 Star64, another board with a jh7110 SoC. This board
is almost identical to the existing Milk-v Mars and VisionFive 2 boards
that are already support - just with a different PHY configuration and
only one of the two PCIe ports exposed. Additionally, the Mars and
VisionFive 2 get their PCie configuration added.

Microchip:
A dts for the BeagleV Fire. PCIe is disabled on it for now, as some
binding and driver changes are required.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>

* tag 'riscv-dt-for-v6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux:
  riscv: dts: starfive: add PCIe dts configuration for JH7110
  riscv: dts: microchip: add an initial devicetree for the BeagleV Fire
  dt-bindings: riscv: microchip: document beaglev-fire
  riscv: dts: starfive: Update flash partition layout
  riscv: dts: thead: th1520: Add PMU event node
  riscv: dts: starfive: add Star64 board devicetree
  dt-bindings: riscv: starfive: add Star64 board compatible
  dt-bindings: riscv: Add T-HEAD C908 compatible

Link: https://lore.kernel.org/r/20240707-nuttiness-lustfully-4aaf03c991b2@spud
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-08 16:58:37 +02:00
Arnd Bergmann
9289b97a67 Allwinner SoC device tree changes for 6.11
This includes a commit shared with the clk tree. This commit adds clock
 and reset indices to the device tree binding, and thus is needed for
 both the device tree and driver changes.
 
 ARM64 device tree and binding-only changes
 - Add LRADC (low resolution ADC for resistor network based keys) for H616 SoC
 - Add cache information for A64, H6, and H616 SoCs
 - Correct model names and descriptions for Pine64 boards
 - Add GPADC (general purpose ADC) for H616 SoC
 - Add ADC joysticks based on GPADC for anbernic-rg35xx-h board
 - Add additional CPU OPPs for the H700 on top of existing H616 ones
 - Enable DVFS for rg35xx boards
 - Add IOMMU for H616 SoC
 
 RISC-V device tree changes
 - Add system LDOs to D1s/T113 SoC
 - Add ClockworkPi and DevTerm device trees
 -----BEGIN PGP SIGNATURE-----
 
 iQJCBAABCgAsFiEE2nN1m/hhnkhOWjtHOJpUIZwPJDAFAmaC5BEOHHdlbnNAY3Np
 ZS5vcmcACgkQOJpUIZwPJDDszRAAohtAXVlhcOZUyY6jnToNDRhRDe4WR2dKNuth
 D/sM93uBxEtHkHsV11xKdmXALYszT3c1IxA84Cj3uwBvutIK4/KQUDbtDvGedz/o
 90v3bgAR1ETf7RrQFCxpQUFYNufpnXb5ZHA5jRSdxyg9coLR7sDab0r6UCh/EzyN
 rykvB2oSmjxyG0OjBUU3IcR0B5nliGRc9esVhdSW4cGblaJaT/NrlsmUGL39hXP2
 BayJ3FFfS2izs5uHMOeQUaveadBGssc/kzJvnwy/jeM+76uY4SlH58/dl0ivWM0/
 h5mifWbMIdl5kUEkUz7OaPT9O6K/1O1/heQYrQSHNKtgD6NN9uS4QWuGyfsWcpUo
 iRXnYrscWuGQH8ICAG9upNCto2RXn1nS3+FYv/CmNQ+Nl39RpGGWKW9KPesCttdr
 OgjWGgspyBT3xh4NhrxLbEOmZaCg6ZGfr8Vp4VrXLMbSOrV6Qu8VsJcRWkfwm3yb
 tg9ONSFrSBNwmWQIqjtpWt3j81sOKkz2WYV6+hNgDyQ5HZQZWNHhOUVQ2W912TNh
 vJ9d+3jgtn9wj2DyPdkzPRaNaVK+KqRbpUIXLc7TUAJaMcHzX6+ekt1576ZBNUYL
 hFOoMvzu6k02Fmr8qUe9PX2p5Ie6ubF+2/aWGkZ0pF2cdht1Q6IDiXsSX3PW+VBU
 acB1TT0=
 =3vmg
 -----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaL9/8ACgkQYKtH/8kJ
 UifiHRAA0qHvjj5uaQiQnjJgOFF+Rmf3goNje2S9f35ziO2AGKqsfp651mBbZFpJ
 yFkIMlELMBLmgB9cEXfvoWlvUjITxduiMjn0aNfu4czRhhk1qiqwy+CVhDHT681g
 71s1crZ0gT/B9hNQkfHz6AmrvMl6k9NPHioaci4abRNXsGNrZuuSFNOAC82l8nrb
 PLJZtNOEbtkVP5pdBatTlsdJywOLpl1KwYvXbbC8oXvMobywUSsW/gjcHV0hUjlp
 4kwDEyi1cUGeNEuPAcp2Pp0H+WtHJsaTN1BhIy3kyCGiCXX3ShkZiaYIPwBGdUrz
 NJihJ/WoU2pMzlX5OH2L2d1KybZhOPB+kzSqw2TWyZuKNc+HY3BaXeZ8uVUz59Tv
 WSml2dz4+T816jDcBrK9sFC1uiJlbieKNJsH8Ol0FykJcfjD48tP7Y1uMhqN3ict
 y/FZ6nvPNEf7YPGGTXDjn0d5ifXAyGXLhIWxRXBP701HAzTkUPeTEdMLwj+6H1eh
 HBxFaa4yeAtH9+0c/Tf1Ge244OxbavwjJU4ufyD7X440nHqFVXQvF2FD451tBd5F
 HbXZ3cm+E8/zg55L0sAcY++5W+Aa2S+TvK64nrKcGEEDonBj3WwdNL5iwtzugWCv
 /AQg1UP3uQIyPvrwcyyoG8lenyHjOhZnACH3qbxpO9JnnwEIXQk=
 =Emn4
 -----END PGP SIGNATURE-----

Merge tag 'sunxi-dt-for-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux into soc/dt

Allwinner SoC device tree changes for 6.11

This includes a commit shared with the clk tree. This commit adds clock
and reset indices to the device tree binding, and thus is needed for
both the device tree and driver changes.

ARM64 device tree and binding-only changes
- Add LRADC (low resolution ADC for resistor network based keys) for H616 SoC
- Add cache information for A64, H6, and H616 SoCs
- Correct model names and descriptions for Pine64 boards
- Add GPADC (general purpose ADC) for H616 SoC
- Add ADC joysticks based on GPADC for anbernic-rg35xx-h board
- Add additional CPU OPPs for the H700 on top of existing H616 ones
- Enable DVFS for rg35xx boards
- Add IOMMU for H616 SoC

RISC-V device tree changes
- Add system LDOs to D1s/T113 SoC
- Add ClockworkPi and DevTerm device trees

* tag 'sunxi-dt-for-6.11' of https://git.kernel.org/pub/scm/linux/kernel/git/sunxi/linux:
  riscv: dts: allwinner: Add ClockworkPi and DevTerm devicetrees
  riscv: dts: allwinner: d1s-t113: Add system LDOs
  arm64: dts: allwinner: h616: add IOMMU node
  arm64: dts: allwinner: rg35xx: Enable DVFS CPU frequency scaling
  arm64: dts: allwinner: h616: add additional CPU OPPs for the H700
  arm64: dts: allwinner: anbernic-rg35xx-h: Add ADC joysticks
  arm64: dts: allwinner: h616: Add GPADC device node
  dt-bindings: clock: sun50i-h616-ccu: Add GPADC clocks
  ARM: dts: sunxi: remove duplicated entries in makefile
  arm64: dts: allwinner: Add cache information to the SoC dtsi for H616
  arm64: dts: allwinner: Add cache information to the SoC dtsi for A64
  arm64: dts: allwinner: Correct the model names for Pine64 boards
  dt-bindings: arm: sunxi: Correct the descriptions for Pine64 boards
  arm64: dts: allwinner: Add cache information to the SoC dtsi for H6
  ARM: dts: sun50i: Add LRADC node
  dt-bindings: input: sun4i-lradc-keys: Add H616 compatible

Link: https://lore.kernel.org/r/ZoQa8r1N8yi7FlPV@wens.tw
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-07-08 16:30:23 +02:00
Puranjay Mohan
a5912c37fa riscv, bpf: Optimize stack usage of trampoline
When BPF_TRAMP_F_CALL_ORIG is not set, stack space for passing arguments
on stack doesn't need to be reserved because the original function is
not called.

Only reserve space for stacked arguments when BPF_TRAMP_F_CALL_ORIG is
set.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240708114758.64414-1-puranjay@kernel.org
2024-07-08 15:44:08 +02:00
Linus Torvalds
b673f2bda0 RISC-V Fixes for 6.10-rc7
* A fix for the CMODX example in therecently added icache flushing
   prctl().
 * A fix to the perf driver to avoid corrupting event data on counter
   overflows when external overflow handlers are in use.
 * A fix to clear all hardware performance monitor events on boot, to
   avoid dangling events firmware or previously booted kernels from
   triggering spuriously.
 * A fix to the perf event probing logic to avoid erroneously reporting
   the presence of unimplemented counters.  This also prevents some
   implemented counters from being reported.
 * A build fix for the vector sigreturn selftest on clang.
 * A fix to ftrace, which now requires the previously optional index
   argument to ftrace_graph_ret_addr().
 * A fix to avoid deadlocking if kexec crash handling triggers in an
   interrupt context.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmaIJBATHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiYk6D/981DUAWJ5JPsqve7PihWnhFXh7T/fm
 KZL7cNQN7/9QmqzJMD756oQCHZT2TeDTxwji4WUQo27uoS1SamsAxRWCPdW8GqDt
 GwBJeviyWDwjNMgrejWwgH3d9so+WZ4kNKfiUrY+j1vgQ8TkE4h5wMzUtOBTSgDI
 5EhHT5B5yjiRcadPshXivZAyimc6mxKJKph5v8W3BGgtLQRHs5tYop4ZkP5Utmv3
 yBie7orfMRx5fNxE6fgn0c/3r49i+KGTSCzkK+0689qPlQNt7MTj4kqDVp7xu2ll
 jl5GJNZrWSZR0cST9AG3VByqfeN2f9sbGYq5fAozkZy3idEYovtvGIU2xJVZRuIU
 ZhY+VTk0fwO8HlilTLMbyk7t99EJ4a7bXcUuD6ub3BthlKfc41PArhZgasL/dFPd
 VOSjy5hfGpJgmifSTpPXElf8jgBq6N4Kw9N+rBNkNiruEiwtWfsyqOckYAfNbULe
 Z8Nikl+3pfWlwzQrAb30X78s4ZyJyOX+XxP118lvx+UAbZofxg5qJJGo7U0Ru54r
 JPBCW8swlco6AXwvAj3yKcaL3qtKlc6f068QvcSaRELUvS2qfuJ7w4fjKdl/IT93
 QggGUyuEVG3UC1Dj961plrACXmqISTAlW8HqkdPvUgLY9rSPuTLuCR54b3fGI+n/
 3wJF6gl5leEPMw==
 =/Gsf
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for the CMODX example in the recently added icache flushing
   prctl()

 - A fix to the perf driver to avoid corrupting event data on counter
   overflows when external overflow handlers are in use

 - A fix to clear all hardware performance monitor events on boot, to
   avoid dangling events firmware or previously booted kernels from
   triggering spuriously

 - A fix to the perf event probing logic to avoid erroneously reporting
   the presence of unimplemented counters. This also prevents some
   implemented counters from being reported

 - A build fix for the vector sigreturn selftest on clang

 - A fix to ftrace, which now requires the previously optional index
   argument to ftrace_graph_ret_addr()

 - A fix to avoid deadlocking if kexec crash handling triggers in an
   interrupt context

* tag 'riscv-for-linus-6.10-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: kexec: Avoid deadlock in kexec crash path
  riscv: stacktrace: fix usage of ftrace_graph_ret_addr()
  riscv: selftests: Fix vsetivli args for clang
  perf: RISC-V: Check standard event availability
  drivers/perf: riscv: Reset the counter to hpmevent mapping while starting cpus
  drivers/perf: riscv: Do not update the event data if uptodate
  documentation: Fix riscv cmodx example
2024-07-05 12:22:51 -07:00
Jakub Kicinski
76ed626479 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/phy/aquantia/aquantia.h
  219343755e ("net: phy: aquantia: add missing include guards")
  61578f6793 ("net: phy: aquantia: add support for PHY LEDs")

drivers/net/ethernet/wangxun/libwx/wx_hw.c
  bd07a98178 ("net: txgbe: remove separate irq request for MSI and INTx")
  b501d261a5 ("net: txgbe: add FDIR ATR support")
https://lore.kernel.org/all/20240703112936.483c1975@canb.auug.org.au/

include/linux/mlx5/mlx5_ifc.h
  048a403648 ("net/mlx5: IFC updates for changing max EQs")
  99be56171f ("net/mlx5e: SHAMPO, Re-enable HW-GRO")
https://lore.kernel.org/all/20240701133951.6926b2e3@canb.auug.org.au/

Adjacent changes:

drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c
  4130c67cd1 ("wifi: iwlwifi: mvm: check vif for NULL/ERR_PTR before dereference")
  3f3126515f ("wifi: iwlwifi: mvm: add mvm-specific guard")

include/net/mac80211.h
  816c6bec09 ("wifi: mac80211: fix BSS_CHANGED_UNSOL_BCAST_PROBE_RESP")
  5a009b42e0 ("wifi: mac80211: track changes in AP's TPE")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-07-04 14:16:11 -07:00
Bang Li
8f65aa3223 mm: implement update_mmu_tlb() using update_mmu_tlb_range()
Let's make update_mmu_tlb() simply a generic wrapper around
update_mmu_tlb_range().  Only the latter can now be overridden by the
architecture.  We can now remove __HAVE_ARCH_UPDATE_MMU_TLB as well.

Link: https://lkml.kernel.org/r/20240522061204.117421-3-libang.li@antgroup.com
Signed-off-by: Bang Li <libang.li@antgroup.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:29:57 -07:00
Bang Li
23b1b44e6c mm: add update_mmu_tlb_range()
Patch series "Add update_mmu_tlb_range() to simplify code", v4.

This series of commits mainly adds the update_mmu_tlb_range() to batch
update tlb in an address range and implement update_mmu_tlb() using
update_mmu_tlb_range().

After commit 19eaf44954 ("mm: thp: support allocation of anonymous
multi-size THP"), We may need to batch update tlb of a certain address
range by calling update_mmu_tlb() in a loop.  Using the
update_mmu_tlb_range(), we can simplify the code and possibly reduce the
execution of some unnecessary code in some architectures.


This patch (of 3):

Add update_mmu_tlb_range(), we can batch update tlb of an address range.

Link: https://lkml.kernel.org/r/20240522061204.117421-1-libang.li@antgroup.com
Link: https://lkml.kernel.org/r/20240522061204.117421-2-libang.li@antgroup.com
Signed-off-by: Bang Li <libang.li@antgroup.com>
Acked-by: David Hildenbrand <david@redhat.com>
Cc: Chris Zankel <chris@zankel.net>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Lance Yang <ioworker0@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2024-07-03 19:29:57 -07:00
Song Shuai
c562ba719d
riscv: kexec: Avoid deadlock in kexec crash path
If the kexec crash code is called in the interrupt context, the
machine_kexec_mask_interrupts() function will trigger a deadlock while
trying to acquire the irqdesc spinlock and then deactivate irqchip in
irq_set_irqchip_state() function.

Unlike arm64, riscv only requires irq_eoi handler to complete EOI and
keeping irq_set_irqchip_state() will only leave this possible deadlock
without any use. So we simply remove it.

Link: https://lore.kernel.org/linux-riscv/20231208111015.173237-1-songshuaishuai@tinylab.org/
Fixes: b17d19a531 ("riscv: kexec: Fixup irq controller broken in kexec crash path")
Signed-off-by: Song Shuai <songshuaishuai@tinylab.org>
Reviewed-by: Ryo Takakura <takakura@valinux.co.jp>
Link: https://lore.kernel.org/r/20240626023316.539971-1-songshuaishuai@tinylab.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 13:11:30 -07:00
Puranjay Mohan
393da6cbb2
riscv: stacktrace: fix usage of ftrace_graph_ret_addr()
ftrace_graph_ret_addr() takes an `idx` integer pointer that is used to
optimize the stack unwinding. Pass it a valid pointer to utilize the
optimizations that might be available in the future.

The commit is making riscv's usage of ftrace_graph_ret_addr() match
x86_64.

Signed-off-by: Puranjay Mohan <puranjay@kernel.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Link: https://lore.kernel.org/r/20240618145820.62112-1-puranjay@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 13:10:03 -07:00
Palmer Dabbelt
210ac17ded
Merge patch series "Assorted fixes in RISC-V PMU driver"
Atish Patra <atishp@rivosinc.com> says:

This series contains 3 fixes out of which the first one is a new fix
for invalid event data reported in lkml[2]. The last two are v3 of Samuel's
patch[1]. I added the RB/TB/Fixes tag and moved 1 unrelated change
to its own patch. I also changed an error message in kvm vcpu_pmu from
pr_err to pr_debug to avoid redundant failure error messages generated
due to the boot time quering of events implemented in the patch[1]

Here is the original cover letter for the patch[1]

Before this patch:
$ perf list hw

List of pre-defined events (to be used in -e or -M):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  bus-cycles                                         [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]
  ref-cycles                                         [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]

$ perf stat -ddd true

 Performance counter stats for 'true':

              4.36 msec task-clock                       #    0.744 CPUs utilized
                 1      context-switches                 #  229.325 /sec
                 0      cpu-migrations                   #    0.000 /sec
                38      page-faults                      #    8.714 K/sec
         4,375,694      cycles                           #    1.003 GHz                         (60.64%)
           728,945      instructions                     #    0.17  insn per cycle
            79,199      branches                         #   18.162 M/sec
            17,709      branch-misses                    #   22.36% of all branches
           181,734      L1-dcache-loads                  #   41.676 M/sec
             5,547      L1-dcache-load-misses            #    3.05% of all L1-dcache accesses
     <not counted>      LLC-loads                                                               (0.00%)
     <not counted>      LLC-load-misses                                                         (0.00%)
     <not counted>      L1-icache-loads                                                         (0.00%)
     <not counted>      L1-icache-load-misses                                                   (0.00%)
     <not counted>      dTLB-loads                                                              (0.00%)
     <not counted>      dTLB-load-misses                                                        (0.00%)
     <not counted>      iTLB-loads                                                              (0.00%)
     <not counted>      iTLB-load-misses                                                        (0.00%)
     <not counted>      L1-dcache-prefetches                                                    (0.00%)
     <not counted>      L1-dcache-prefetch-misses                                               (0.00%)

       0.005860375 seconds time elapsed

       0.000000000 seconds user
       0.010383000 seconds sys

After this patch:
$ perf list hw

List of pre-defined events (to be used in -e or -M):

  branch-instructions OR branches                    [Hardware event]
  branch-misses                                      [Hardware event]
  cache-misses                                       [Hardware event]
  cache-references                                   [Hardware event]
  cpu-cycles OR cycles                               [Hardware event]
  instructions                                       [Hardware event]

$ perf stat -ddd true

 Performance counter stats for 'true':

              5.16 msec task-clock                       #    0.848 CPUs utilized
                 1      context-switches                 #  193.817 /sec
                 0      cpu-migrations                   #    0.000 /sec
                37      page-faults                      #    7.171 K/sec
         5,183,625      cycles                           #    1.005 GHz
           961,696      instructions                     #    0.19  insn per cycle
            85,853      branches                         #   16.640 M/sec
            20,462      branch-misses                    #   23.83% of all branches
           243,545      L1-dcache-loads                  #   47.203 M/sec
             5,974      L1-dcache-load-misses            #    2.45% of all L1-dcache accesses
   <not supported>      LLC-loads
   <not supported>      LLC-load-misses
   <not supported>      L1-icache-loads
   <not supported>      L1-icache-load-misses
   <not supported>      dTLB-loads
            19,619      dTLB-load-misses
   <not supported>      iTLB-loads
             6,831      iTLB-load-misses
   <not supported>      L1-dcache-prefetches
   <not supported>      L1-dcache-prefetch-misses

       0.006085625 seconds time elapsed

       0.000000000 seconds user
       0.013022000 seconds sys

[1] https://lore.kernel.org/linux-riscv/20240418014652.1143466-1-samuel.holland@sifive.com/
[2] https://lore.kernel.org/all/CC51D53B-846C-4D81-86FC-FBF969D0A0D6@pku.edu.cn/

* b4-shazam-merge:
  perf: RISC-V: Check standard event availability
  drivers/perf: riscv: Reset the counter to hpmevent mapping while starting cpus
  drivers/perf: riscv: Do not update the event data if uptodate

Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-0-e01cfddcf035@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 12:57:41 -07:00
Samuel Holland
16d3b1af09
perf: RISC-V: Check standard event availability
The RISC-V SBI PMU specification defines several standard hardware and
cache events. Currently, all of these events are exposed to userspace,
even when not actually implemented. They appear in the `perf list`
output, and commands like `perf stat` try to use them.

This is more than just a cosmetic issue, because the PMU driver's .add
function fails for these events, which causes pmu_groups_sched_in() to
prematurely stop scheduling in other (possibly valid) hardware events.

Add logic to check which events are supported by the hardware (i.e. can
be mapped to some counter), so only usable events are reported to
userspace. Since the kernel does not know the mapping between events and
possible counters, this check must happen during boot, when no counters
are in use. Make the check asynchronous to minimize impact on boot time.

Fixes: e999143459 ("RISC-V: Add perf platform driver based on SBI PMU extension")

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Tested-by: Atish Patra <atishp@rivosinc.com>
Signed-off-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240628-misc_perf_fixes-v4-3-e01cfddcf035@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-07-03 12:56:22 -07:00
Pu Lehui
6801b0aef7 riscv, bpf: Add 12-argument support for RV64 bpf trampoline
This patch adds 12 function arguments support for riscv64 bpf trampoline.
The current bpf trampoline supports <= sizeof(u64) bytes scalar arguments [0]
and <= 16 bytes struct arguments [1]. Therefore, we focus on the situation
where scalars are at most XLEN bits and aggregates whose total size does not
exceed 2×XLEN bits in the riscv calling convention [2].

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Acked-by: Björn Töpel <bjorn@kernel.org>
Acked-by: Puranjay Mohan <puranjay@kernel.org>
Link: https://elixir.bootlin.com/linux/v6.8/source/kernel/bpf/btf.c#L6184 [0]
Link: https://elixir.bootlin.com/linux/v6.8/source/kernel/bpf/btf.c#L6769 [1]
Link: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/releases/download/draft-20230929-e5c800e661a53efe3c2678d71a306323b60eb13b/riscv-abi.pdf [2]
Link: https://lore.kernel.org/bpf/20240702121944.1091530-2-pulehui@huaweicloud.com
2024-07-02 16:01:30 +02:00
Linus Torvalds
651ab78190 SoC fixes for 6.10, part 2
A number of devicetree fixes came in for the rockchip platforms, correcting
 some of the address information, and reverting a change to the MMC controller
 configuration that caused regressions.
 
 Four drivers have one code change each, addressing minor build issues for
 the optee firmware driver, the litex SoC platform driver and two reset
 drivers.
 
 The riscv fixes as also simple, mainly turning off device nodes in the
 canaan dts files unless they are actually usable on a particular board.
 
 Finally, Drew takes over maintaining the THEAD RISC-V SoC platform.
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmaCpVIACgkQYKtH/8kJ
 UieIfQ/+KtxzYPfLsUgSZJCeKa6d1A9EgtTBtcEn6gI4HAc5NFDGvYIIetWI/RYN
 2+zOLRdQ8t3CIi2c1sTqy1m+j5vX94p4/2WSW41zASHxN+ryz8VM2SKzE5TGV8IE
 WyHv5fBIHY97u2zegq8K4c/ze2W7bdwBd1V5tYwk7tZnm6VFNMCESQ+Q4mu7kjdy
 irCdxr0j+uDO0cGppwuGWSSR+BiCCCDhu9YjmluD9B57IIB95lyQRgdGy/V9cT9D
 yJS6VwEi+EFBNbzt7TzNrPiXvymQzDeC5K7JavfSRRxW1a/rWLmkmxripiSVZirf
 nHR7cIivj0gvjeZiM3UH/ZMPUdzRk4YXr9889EbO+JQ/iZy/1YIJHKuqLbCjAQZp
 RgjuYYAuG91aHeCHN6cSflNoC49FmmBi9k1GkBdWviU0mhBrfaNfqspMkWUuH34g
 w49H9iflFzJsh6p18h2wG3pB28zubUhLfpmbG5EOP4kXBYvgegAsY9mgSNWcdFFk
 JBxcJlToCm6ooha8bBCpsnDf8/Y1LMlUIcP2Spd/iZ0pxtF2wR8Pt80KDuVko+fj
 TZF3UTu3wuDCyHnSvX0Fc0YTQV0huVawd23upWBxpcdOa4vsScnUbMdKqBIjA6TX
 cGK+fCtVSk45KoAlRi4eLcnNmgse1G2J+ujcrs0QakLxdCx+Ggk=
 =+O9s
 -----END PGP SIGNATURE-----

Merge tag 'arm-fixes-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc

Pull SoC fixes from Arnd Bergmann:
 "A number of devicetree fixes came in for the rockchip platforms,
  correcting some of the address information, and reverting a change to
  the MMC controller configuration that caused regressions.

  Four drivers have one code change each, addressing minor build issues
  for the optee firmware driver, the litex SoC platform driver and two
  reset drivers.

  The riscv fixes as also simple, mainly turning off device nodes in the
  canaan dts files unless they are actually usable on a particular
  board.

  Finally, Drew takes over maintaining the THEAD RISC-V SoC platform"

* tag 'arm-fixes-6.10-2' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc:
  drivers/soc/litex: drop obsolete dependency on COMPILE_TEST
  tee: optee: ffa: Fix missing-field-initializers warning
  arm64: dts: rockchip: Add sound-dai-cells for RK3368
  arm64: dts: rockchip: Fix the i2c address of es8316 on Cool Pi 4B
  reset: hisilicon: hi6220: add missing MODULE_DESCRIPTION() macro
  reset: gpio: Fix missing gpiolib dependency for GPIO reset controller
  MAINTAINERS: thead: update Maintainer
  arm64: dts: rockchip: fix PMIC interrupt pin on ROCK Pi E
  riscv: dts: starfive: Set EMMC vqmmc maximum voltage to 3.3V on JH7110 boards
  arm64: dts: rockchip: make poweroff(8) work on Radxa ROCK 5A
  Revert "arm64: dts: rockchip: remove redundant cd-gpios from rk3588 sdmmc nodes"
  ARM: dts: rockchip: rk3066a: add #sound-dai-cells to hdmi node
  arm64: dts: rockchip: Fix the value of `dlg,jack-det-rate` mismatch on rk3399-gru
  arm64: dts: rockchip: set correct pwm0 pinctrl on rk3588-tiger
  riscv: dts: canaan: Disable I/O devices unless used
  riscv: dts: canaan: Clean up serial aliases
  arm64: dts: rockchip: Rename LED related pinctrl nodes on rk3308-rock-pi-s
  arm64: dts: rockchip: Fix SD NAND and eMMC init on rk3308-rock-pi-s
  arm64: dts: rockchip: Fix rk3308 codec@ff560000 reset-names
  arm64: dts: rockchip: Fix the DCDC_REG2 minimum voltage on Quartz64 Model B
2024-07-01 09:36:20 -07:00
Pu Lehui
2382a405c5 riscv, bpf: Use bpf_prog_pack for RV64 bpf trampoline
We used bpf_prog_pack to aggregate bpf programs into huge page to
relieve the iTLB pressure on the system. We can apply it to bpf
trampoline, as Song had been implemented it in core and x86 [0]. This
patch is going to use bpf_prog_pack to RV64 bpf trampoline. Since Song
and Puranjay have done a lot of work for bpf_prog_pack on RV64,
implementing this function will be easy.

Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Björn Töpel <bjorn@rivosinc.com> #riscv
Link: https://lore.kernel.org/all/20231206224054.492250-1-song@kernel.org [0]
Link: https://lore.kernel.org/bpf/20240622030437.3973492-4-pulehui@huaweicloud.com
2024-07-01 17:10:46 +02:00
Pu Lehui
9f1e16fb1f riscv, bpf: Fix out-of-bounds issue when preparing trampoline image
We get the size of the trampoline image during the dry run phase and
allocate memory based on that size. The allocated image will then be
populated with instructions during the real patch phase. But after
commit 26ef208c20 ("bpf: Use arch_bpf_trampoline_size"), the `im`
argument is inconsistent in the dry run and real patch phase. This may
cause emit_imm in RV64 to generate a different number of instructions
when generating the 'im' address, potentially causing out-of-bounds
issues. Let's emit the maximum number of instructions for the "im"
address during dry run to fix this problem.

Fixes: 26ef208c20 ("bpf: Use arch_bpf_trampoline_size")
Signed-off-by: Pu Lehui <pulehui@huawei.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Link: https://lore.kernel.org/bpf/20240622030437.3973492-3-pulehui@huaweicloud.com
2024-07-01 17:10:46 +02:00
Minda Chen
2904244a8c riscv: dts: starfive: add PCIe dts configuration for JH7110
Add PCIe dts configuraion for JH7110 SoC platform. The Star64 only has
one exposed PCIe port, so only the Mars and VisionFive 2 get two
enabled.

Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
Reviewed-by: Hal Feng <hal.feng@starfivetech.com>
[conor: squash in star64's single exposed port]
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-07-01 13:20:19 +01:00
Greg Kroah-Hartman
33827dc4ad Linux 6.10-rc6
-----BEGIN PGP SIGNATURE-----
 
 iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmaB0NweHHRvcnZhbGRz
 QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGkvwH/36UJRk/o6wvXnyH
 E6QjCSWo2226APyWks22NjtC3I/8Iqdvkneuh6wG0qL2sXAB078EMjUq5R81bF8H
 wWFBJwetjYTp8GEyLioMEb2wCH/J3R29dLFC4UYTplafXRGP6//xcpJaKmTxcgdR
 31IzvTPXbApZ7L3k1U6rA2bK9PNKcFCOvZlrNMUCuwMrabymHsDfOUt1DqXyg2xp
 zjqiWYBwlklozmgawSWt/mdEgkWuTcAbg+KyqDVQF59s9aj/OOwZ0j+HACq5V8CM
 quTPIAYL6CC9p7uxa69lGr/sgC0Is/BZLPX7RTZAwCgarGvnX+1HUsjDcaFCtrVg
 O6fPUV8=
 =pgUx
 -----END PGP SIGNATURE-----

Merge 6.10-rc6 into tty-next

This resolves the merge issues in the 8250 code due to some reverts in
6.10-rc6 in the console changes.

Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-07-01 14:16:48 +02:00
Samuel Holland
0ce1d34678 riscv: dts: allwinner: Add ClockworkPi and DevTerm devicetrees
Clockwork Tech manufactures several SoMs for their RasPi CM3-compatible
"ClockworkPi" mainboard. Their R-01 SoM features the Allwinner D1 SoC.
The R-01 contains only the CPU, DRAM, and always-on voltage regulation;
it does not merit a separate devicetree.

The ClockworkPi mainboard features analog audio, a MIPI-DSI panel, USB
host and peripheral ports, an Ampak AP6256 WiFi/Bluetooth module, and an
X-Powers AXP228 PMIC for managing a Li-ion battery.

The DevTerm is a complete system which extends the ClockworkPi mainboard
with a MIPI-DSI panel and a pair of expansion boards. These expansion
boards provide a fan, a USB keyboard, speakers, and a thermal printer.

Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Samuel Holland <samuel@sholland.org>
Link: https://lore.kernel.org/r/20240622150731.1105901-4-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2024-06-30 23:06:52 +08:00
Chen-Yu Tsai
8f2cf4442b riscv: dts: allwinner: d1s-t113: Add system LDOs
Now that the bindings for the system LDOs have been merged, the nodes
for the system LDOs can be added. These are used on the ClockworkPi.

This was originally part of Samuel's D1 device tree series [1], but was
dropped in v5 as the regulator bindings weren't merged at the time.

[1] https://lore.kernel.org/linux-sunxi/20221231233851.24923-1-samuel@sholland.org/

Link: https://lore.kernel.org/r/20240622150731.1105901-3-wens@kernel.org
Signed-off-by: Chen-Yu Tsai <wens@csie.org>
2024-06-30 23:06:51 +08:00
Linus Torvalds
de0a9f4486 RISC-V Fixes for 6.10-rc6
* A fix for vector load/store instruction decoding, which could result
   in reserved vector element length encodings decoding as valid vector
   instructions.
 * Instruction patching now aggressively flushes the local instruction
   cache, to avoid situations where patching functions on the flush path
   results in torn instructions being fetched.
 * A fix to prevent the stack walker from showing up as part of traces.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmZ+4zUTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiTXxD/wPSWbHf24Mr4CrFYbKR7lHWjGku+jG
 8LQa+B9uUgpA8XjNjeeECg7lsJq/1avbPrUlValRckUIMZPHSWK4U7aFkkPs1WFa
 87D5pA4AVkt5U8v/3c5GQ8Tod0Afa7OyFggxdglC3XFvUa5TNdn3pdv0rdE4Mx5l
 QRijFyLlhRv/D5We+exNAVJmkdHfSXQEEyEjXeb83VK+PsZzAXvHLj3omxCyQ1kH
 Kt2RyN8QpkUisFNTpSvPHiuoPjUeJvRs0JIyrO1SwBGHyYs4kg6g0KBk4YjTTXG5
 xbRVG2tMO9TS2jRRHJS7fdI+yF0X1z+t+WUG1F1WHvKxnqbWczJ0UczFgacaVSkw
 BM/yO+VPg22f6bM4H85K5GBdhN8PplFDuHDdVQ8/LDGOrQKaByrorXWq3WrbwJcq
 vVwbnBGW6v4we5COzyHvwnakzl4bEMHoUb2NVTzZM+fFleiEdx4Sg+F5Us8+UlZh
 PztwjPao8spIm81l/wXStxYzschDyonCt74/odic2LDtFBirZWzDdUInXFVzUZs7
 CUxF38XJ6SNQJBVVwQv6qisoWhy6Ca9SGKwY2GwW7Ustx3C4Eh0nrOVmI/DcHRgN
 9rGm13Qfm8eUSznTM+buWTTluZvtmZupGpAP2GhvaUgTMDIfK/vttidW9Kf4KsP8
 hn+jllIc1WgE6Q==
 =0bpZ
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - A fix for vector load/store instruction decoding, which could result
   in reserved vector element length encodings decoding as valid vector
   instructions.

 - Instruction patching now aggressively flushes the local instruction
   cache, to avoid situations where patching functions on the flush path
   results in torn instructions being fetched.

 - A fix to prevent the stack walker from showing up as part of traces.

* tag 'riscv-for-linus-6.10-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: stacktrace: convert arch_stack_walk() to noinstr
  riscv: patch: Flush the icache right after patching to avoid illegal insns
  RISC-V: fix vector insn load/store width mask
2024-06-28 16:14:59 -07:00
Jakub Kicinski
193b9b2002 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:
  e3f02f32a0 ("ionic: fix kernel panic due to multi-buffer handling")
  d9c0420999 ("ionic: Mark error paths in the data path as unlikely")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-27 12:14:11 -07:00
Arnd Bergmann
53ed12744c RISC-V Devicetree fixes for v6.10-rc5+
T-Head:
 Jisheng hasn't got enough time to look after the platform, so Drew
 Fustini is going to take over.
 
 StarFive:
 A fix for a regulator voltage range that prevented using low performance
 SD cards.
 
 Canaan:
 Cleanup for some "over eager" aliases for serial ports that did not
 exist on some boards and I/O devices disabled on boards where they were
 not actually in use.
 
 Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQRh246EGq/8RLhDjO14tDGHoIJi0gUCZnWJ7wAKCRB4tDGHoIJi
 0pYNAP9f3zNfNJF+nIm10LzLnrX3U/sqYZaFlTPqpzQCpvdtTgEAvaxfN5x6zcR9
 6obpBSAlYN+7OlFVEtMfUf6UYq2rwwA=
 =AQmx
 -----END PGP SIGNATURE-----

Merge tag 'riscv-dt-fixes-for-v6.10-rc5+' of https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux into arm/fixes

RISC-V Devicetree fixes for v6.10-rc5+

T-Head:
Jisheng hasn't got enough time to look after the platform, so Drew
Fustini is going to take over.

StarFive:
A fix for a regulator voltage range that prevented using low performance
SD cards.

Canaan:
Cleanup for some "over eager" aliases for serial ports that did not
exist on some boards and I/O devices disabled on boards where they were
not actually in use.

Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-27 16:09:13 +02:00
Palmer Dabbelt
60a6707f58
Merge patch series "riscv: Memory Hot(Un)Plug support"
Björn Töpel <bjorn@kernel.org> says:

From: Björn Töpel <bjorn@rivosinc.com>

================================================================
Memory Hot(Un)Plug support (and ZONE_DEVICE) for the RISC-V port
================================================================

Introduction
============

To quote "Documentation/admin-guide/mm/memory-hotplug.rst": "Memory
hot(un)plug allows for increasing and decreasing the size of physical
memory available to a machine at runtime."

This series adds memory hot(un)plugging, and ZONE_DEVICE support for
the RISC-V Linux port.

MM configuration
================

RISC-V MM has the following configuration:

 * Memory blocks are 128M, analogous to x86-64. It uses PMD
   ("hugepage") vmemmaps. From that follows that 2M (PMD) worth of
   vmemmap spans 32768 pages á 4K which gets us 128M.

 * The pageblock size is the minimum minimum virtio_mem size, and on
   RISC-V it's 2M (2^9 * 4K).

Implementation
==============

The PGD table on RISC-V is shared/copied between for all processes. To
avoid doing page table synchronization, the first patch (patch 1)
pre-allocated the PGD entries for vmemmap/direct map. By doing that
the init_mm PGD will be fixed at kernel init, and synchronization can
be avoided all together.

The following two patches (patch 2-3) does some preparations, followed
by the actual MHP implementation (patch 4-5). Then, MHP and virtio-mem
are enabled (patch 6-7), and finally ZONE_DEVICE support is added
(patch 8).

MHP and locking
===============

TL;DR: The MHP does not step on any toes, except for ptdump.
Additional locking is required for ptdump.

Long version: For v2 I spent some time digging into init_mm
synchronization/update. Here are my findings, and I'd love them to be
corrected if incorrect.

It's been a gnarly path...

The `init_mm` structure is a special mm (perhaps not a "real" one).
It's a "lazy context" that tracks kernel page table resources, e.g.,
the kernel page table (swapper_pg_dir), a kernel page_table_lock (more
about the usage below), mmap_lock, and such.

`init_mm` does not track/contain any VMAs. Having the `init_mm` is
convenient, so that the regular kernel page table walk/modify
functions can be used.

Now, `init_mm` being special means that the locking for kernel page
tables are special as well.

On RISC-V the PGD (top-level page table structure), similar to x86, is
shared (copied) with user processes. If the kernel PGD is modified, it
has to be synched to user-mode processes PGDs. This is avoided by
pre-populating the PGD, so it'll be fixed from boot.

The in-kernel pgd regions are documented in
`Documentation/arch/riscv/vm-layout.rst`.

The distinct regions are:
 * vmemmap
 * vmalloc/ioremap space
 * direct mapping of all physical memory
 * kasan
 * modules, BPF
 * kernel

Memory hotplug is the process of adding/removing memory to/from the
kernel.

Adding is done in two phases:
 1. Add the memory to the kernel
 2. Online memory, making it available to the page allocator.

Step 1 is partially architecture dependent, and updates the init_mm
page table:
 * Update the direct map page tables. The direct map is a linear map,
   representing all physical memory: `virt = phys + PAGE_OFFSET`
 * Add a `struct page` for each added page of memory. Update the
   vmemmap (virtual mapping to the `struct page`, so we can easily
   transform a kernel virtual address to a `struct page *` address.

From an MHP perspective, there are two regions of the PGD that are
updated:
 * vmemmap
 * direct mapping of all physical memory

The `struct mm_struct` has a couple of locks in play:
 * `spinlock_t page_table_lock` protects the page table, and some
    counters
 * `struct rw_semaphore mmap_lock` protect an mm's VMAs

Note again that `init_mm` does not contain any VMAs, but still uses
the mmap_lock in some places.

The `page_table_lock` was originally used to to protect all pages
tables, but more recently a split page table lock has been introduced.
The split lock has a per-table lock for the PTE and PMD tables. If
split lock is disabled, all tables are guarded by
`mm->page_table_lock` (for user processes). Split page table locks are
not used for init_mm.

MHP operations is typically synchronized using
`DEFINE_STATIC_PERCPU_RWSEM(mem_hotplug_lock)`.

Actors
------

The following non-MHP actors in the kernel traverses (read), and/or
modifies the kernel PGD.

 * `ptdump`

   Walks the entire `init_mm`, via `ptdump_walk_pgd()` with the
   `mmap_write_lock(init_mm)` taken.

   Observation: ptdump can race with MHP, and needs additional locking
   to avoid crashes/races.

 * `set_direct_*` / `arch/riscv/mm/pageattr.c`

   The `set_direct_*` functionality is used to "synchronize" the
   direct map to other kernel mappings, e.g. modules/kernel text. The
   direct map is using "as large huge table mappings as possible",
   which means that the `set_direct_*` might need to split the direct
   map.

  The `set_direct_*` functions operates with the
  `mmap_write_lock(init_mm)` taken.

  Observation: `set_direct_*` uses the direct map, but will never
  modify the same entry as MHP. If there is a mapping, that entry will
  never race with MHP. Further, MHP acts when memory is offline.

 * HVO / `mm/hugetlb_vmemmap`

   HVO optimizes the backing `struct page` for hugetlb pages, which
   means changing the "vmemmap" region. HVO can split (merge?) a
   vmemmap pmd. However, it will never race with MHP, since HVO only
   operates at online memory. HVO cannot touch memory being MHP added
   or removed.

 * `apply_to_page_range`

   Walks a range, creates pages and applies a callback (setting
   permissions) for the page.

   When creating a table, it might use `int __pte_alloc_kernel(pmd_t
   *pmd)` which takes the `init_mm.page_table_lock` to synchronize pmd
   populate.

   Used by: `mm/vmalloc.c` and `mm/kasan/shadow.c`. The KASAN callback
   takes the `init_mm.page_table_lock` to synchronize pte creation.

   Observations: `apply_to_page_range` applies to the "vmalloc/ioremap
   space" region, and "kasan" region. *Not* affected by MHP.

 * `apply_to_existing_page_range`

   Walks a range, applies a callback (setting permissions) for the
   page (no page creation).

   Used by: `kernel/bpf/arena.c` and `mm/kasan/shadow.c`. The KASAN
   callback takes the `init_mm.page_table_lock` to synchronize pte
   creation. *Not* affected by MHP regions.

 * `apply_to_existing_page_range` applies to the "vmalloc/ioremap
   space" region, and "kasan" region. *Not* affected by MHP regions.

 *  `ioremap_page_range` and `vmap_page_range`

    Uses the same internal function, and might create table entries at
    the "vmalloc/ioremap space" region. Can call
    `__pte_alloc_kernel()` which takes the `init_mm.page_table_lock`
    synchronizing pmd populate in the region. *Not* affected by MHP
    regions.

Summary:
  * MHP add will never modify the same page table entries, as any of
    the other actors.
  * MHP remove is done when memory is offlined, and will not clash
    with any of the actors.
  * Functions that walk the entire kernel page table need
    synchronization

  * It's sufficient to add the MHP lock ptdump.

Testing
=======

This series adds basic DT supported hotplugging. There is a QEMU
series enabling MHP for the RISC-V "virt" machine here: [1]

ACPI/MSI support is still in the making for RISC-V, and prior proper
(ACPI) PCI MSI support lands [2] and NUMA SRAT support [3], it hard to
try it out.

I've prepared a QEMU branch with proper ACPI GED/PC-DIMM support [4],
and a this series with the required prerequisites [5] (AIA, ACPI AIA
MADT, ACPI NUMA SRAT).

To test with virtio-mem, e.g.:
  | qemu-system-riscv64 \
  |     -machine virt,aia=aplic-imsic \
  |     -cpu rv64,v=true,vlen=256,elen=64,h=true,zbkb=on,zbkc=on,zbkx=on,zkr=on,zkt=on,svinval=on,svnapot=on,svpbmt=on \
  |     -nodefaults \
  |     -nographic -smp 8 -kernel rv64-u-boot.bin \
  |     -drive file=rootfs.img,format=raw,if=virtio \
  |     -device virtio-rng-pci \
  |     -m 16G,slots=3,maxmem=32G \
  |     -object memory-backend-ram,id=mem0,size=16G \
  |     -numa node,nodeid=0,memdev=mem0 \
  |     -serial chardev:char0 \
  |     -mon chardev=char0,mode=readline \
  |     -chardev stdio,mux=on,id=char0 \
  |     -device pci-serial,id=serial0,chardev=char0 \
  |     -object memory-backend-ram,id=vmem0,size=2G \
  |     -device virtio-mem-pci,id=vm0,memdev=vmem0,node=0

where "rv64-u-boot.bin" is U-boot with EFI/ACPI-support (use [6] if
you're lazy).

In the QEMU monitor:
  | (qemu) info memory-devices
  | (qemu) qom-set vm0 requested-size 1G

...to test DAX/KMEM, use the follow QEMU parameters:
  |  -object memory-backend-file,id=mem1,share=on,mem-path=virtio_pmem.img,size=4G \
  |  -device virtio-pmem-pci,memdev=mem1,id=nv1

and the regular ndctl/daxctl dance.

If you're brave to try the ACPI branch, add "acpi=on" to "-machine
virt", and test PC-DIMM MHP (in addition to virtio-{p},mem):

In the QEMU monitor:
  | (qemu) object_add memory-backend-ram,id=mem1,size=1G
  | (qemu) device_add pc-dimm,id=dimm1,memdev=mem1

You can also try hot-remove with some QEMU options, say:
  | -object memory-backend-file,id=mem-1,size=256M,mem-path=/pagesize-2MB
  | -device pc-dimm,id=mem1,memdev=mem-1
  | -object memory-backend-file,id=mem-2,size=1G,mem-path=/pagesize-1GB
  | -device pc-dimm,id=mem2,memdev=mem-2
  | -object memory-backend-file,id=mem-3,size=256M,mem-path=/pagesize-2MB
  | -device pc-dimm,id=mem3,memdev=mem-3

Remove "acpi=on" to run with DT.

Thanks to Alex, Andrew, David, and Oscar for all
comments/tests/fixups.

References
==========

[1] https://lore.kernel.org/qemu-devel/20240521105635.795211-1-bjorn@kernel.org/
[2] https://lore.kernel.org/linux-riscv/20240501121742.1215792-1-sunilvl@ventanamicro.com/
[3] https://lore.kernel.org/linux-riscv/cover.1713778236.git.haibo1.xu@intel.com/
[4] https://github.com/bjoto/qemu/commits/virtio-mem-pc-dimm-mhp-acpi-v2/
[5] https://github.com/bjoto/linux/commits/mhp-v4-acpi
[6] https://github.com/bjoto/riscv-rootfs-utils/tree/acpi

* b4-shazam-merge:
  riscv: Enable DAX VMEMMAP optimization
  riscv: mm: Add support for ZONE_DEVICE
  virtio-mem: Enable virtio-mem for RISC-V
  riscv: Enable memory hotplugging for RISC-V
  riscv: mm: Take memory hotplug read-lock during kernel page table dump
  riscv: mm: Add memory hotplugging support
  riscv: mm: Add pfn_to_kaddr() implementation
  riscv: mm: Refactor create_linear_mapping_range() for memory hot add
  riscv: mm: Change attribute from __init to __meminit for page functions
  riscv: mm: Pre-allocate vmemmap/direct map/kasan PGD entries
  riscv: mm: Properly forward vmemmap_populate() altmap parameter

Link: https://lore.kernel.org/r/20240605114100.315918-1-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:43:07 -07:00
Björn Töpel
4705c1571a
riscv: Enable DAX VMEMMAP optimization
Now that DAX is usable, enable the DAX VMEMMAP optimization as well.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-12-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:47 -07:00
Björn Töpel
216e04bf1e
riscv: mm: Add support for ZONE_DEVICE
ZONE_DEVICE pages need DEVMAP PTEs support to function
(ARCH_HAS_PTE_DEVMAP). Claim another RSW (reserved for software) bit
in the PTE for DEVMAP mark, add the corresponding helpers, and enable
ARCH_HAS_PTE_DEVMAP for riscv64.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-11-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:46 -07:00
Björn Töpel
f8c2a24055
riscv: Enable memory hotplugging for RISC-V
Enable ARCH_ENABLE_MEMORY_HOTPLUG and ARCH_ENABLE_MEMORY_HOTREMOVE for
RISC-V.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-9-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:44 -07:00
Björn Töpel
37992b7f10
riscv: mm: Take memory hotplug read-lock during kernel page table dump
During memory hot remove, the ptdump functionality can end up touching
stale data. Avoid any potential crashes (or worse), by holding the
memory hotplug read-lock while traversing the page table.

This change is analogous to arm64's commit bf2b59f60e ("arm64/mm:
Hold memory hotplug lock while walking for kernel page table dump").

Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-8-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:43 -07:00
Björn Töpel
c75a74f4ba
riscv: mm: Add memory hotplugging support
For an architecture to support memory hotplugging, a couple of
callbacks needs to be implemented:

 arch_add_memory()
  This callback is responsible for adding the physical memory into the
  direct map, and call into the memory hotplugging generic code via
  __add_pages() that adds the corresponding struct page entries, and
  updates the vmemmap mapping.

 arch_remove_memory()
  This is the inverse of the callback above.

 vmemmap_free()
  This function tears down the vmemmap mappings (if
  CONFIG_SPARSEMEM_VMEMMAP is enabled), and also deallocates the
  backing vmemmap pages. Note that for persistent memory, an
  alternative allocator for the backing pages can be used; The
  vmem_altmap. This means that when the backing pages are cleared,
  extra care is needed so that the correct deallocation method is
  used.

 arch_get_mappable_range()
  This functions returns the PA range that the direct map can map.
  Used by the MHP internals for sanity checks.

The page table unmap/teardown functions are heavily based on code from
the x86 tree. The same remove_pgd_mapping() function is used in both
vmemmap_free() and arch_remove_memory(), but in the latter function
the backing pages are not removed.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-7-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:42 -07:00
Björn Töpel
6e6c5e21b8
riscv: mm: Add pfn_to_kaddr() implementation
The pfn_to_kaddr() function is used by KASAN's memory hotplugging
path. Add the missing function to the RISC-V port, so that it can be
built with MHP and CONFIG_KASAN.

Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-6-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:41 -07:00
Björn Töpel
007480fe84
riscv: mm: Refactor create_linear_mapping_range() for memory hot add
Add a parameter to the direct map setup function, so it can be used in
arch_add_memory() later.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-5-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:40 -07:00
Björn Töpel
fe122b89da
riscv: mm: Change attribute from __init to __meminit for page functions
Prepare for memory hotplugging support by changing from __init to
__meminit for the page table functions that are used by the upcoming
architecture specific callbacks.

Changing the __init attribute to __meminit, avoids that the functions
are removed after init. The __meminit attribute makes sure the
functions are kept in the kernel text post init, but only if memory
hotplugging is enabled for the build.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Oscar Salvador <osalvador@suse.de>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-4-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:39 -07:00
Björn Töpel
66673099f7
riscv: mm: Pre-allocate vmemmap/direct map/kasan PGD entries
The RISC-V port copies the PGD table from init_mm/swapper_pg_dir to
all userland page tables, which means that if the PGD level table is
changed, other page tables has to be updated as well.

Instead of having the PGD changes ripple out to all tables, the
synchronization can be avoided by pre-allocating the PGD entries/pages
at boot, avoiding the synchronization all together.

This is currently done for the bpf/modules, and vmalloc PGD regions.
Extend this scheme for the PGD regions touched by memory hotplugging.

Prepare the RISC-V port for memory hotplug by pre-allocate
vmemmap/direct map/kasan entries at the PGD level. This will roughly
waste ~128 (plus 32 if KASAN is enabled) worth of 4K pages when memory
hotplugging is enabled in the kernel configuration.

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-3-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:39 -07:00
Björn Töpel
e3ecf2fdc8
riscv: mm: Properly forward vmemmap_populate() altmap parameter
Make sure that the altmap parameter is properly passed on to
vmemmap_populate_hugepages().

Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Signed-off-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240605114100.315918-2-bjorn@kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:42:37 -07:00
Haibo Xu
d6ecd18893
riscv: dmi: Add SMBIOS/DMI support
Enable the dmi driver for riscv which would allow access the
SMBIOS info through some userspace file(/sys/firmware/dmi/*).

The change was based on that of arm64 and has been verified
by dmidecode tool.

Signed-off-by: Haibo Xu <haibo1.xu@intel.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240613065507.287577-1-haibo1.xu@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 08:02:33 -07:00
Alexandre Ghiti
50b5bae5be
riscv: Implement pte_accessible()
Like other architectures, a pte is accessible if it is present or if
there is a pending tlb flush and the pte is protnone (which could be the
case when a pte is downgraded to protnone before a flush tlb is
executed).

Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240128115953.25085-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:56:26 -07:00
Palmer Dabbelt
914e618b43
Merge patch series "Add support for a few Zc* extensions, Zcmop and Zimop"
Clément Léger <cleger@rivosinc.com> says:

Add support for (yet again) more RVA23U64 missing extensions. Add
support for Zimop, Zcmop, Zca, Zcf, Zcd and Zcb extensions ISA string
parsing, hwprobe and kvm support. Zce, Zcmt and Zcmp extensions have
been left out since they target microcontrollers/embedded CPUs and are
not needed by RVA23U64.

Since Zc* extensions states that C implies Zca, Zcf (if F and RV32), Zcd
(if D), this series modifies the way ISA string is parsed and now does
it in two phases. First one parses the string and the second one
validates it for the final ISA description.

* b4-shazam-merge:
  KVM: riscv: selftests: Add Zcmop extension to get-reg-list test
  RISC-V: KVM: Allow Zcmop extension for Guest/VM
  riscv: hwprobe: export Zcmop ISA extension
  riscv: add ISA extension parsing for Zcmop
  dt-bindings: riscv: add Zcmop ISA extension description
  KVM: riscv: selftests: Add some Zc* extensions to get-reg-list test
  RISC-V: KVM: Allow Zca, Zcf, Zcd and Zcb extensions for Guest/VM
  riscv: hwprobe: export Zca, Zcf, Zcd and Zcb ISA extensions
  riscv: add ISA parsing for Zca, Zcf, Zcd and Zcb
  riscv: add ISA extensions validation callback
  dt-bindings: riscv: add Zca, Zcf, Zcd and Zcb ISA extension description
  KVM: riscv: selftests: Add Zimop extension to get-reg-list test
  RISC-V: KVM: Allow Zimop extension for Guest/VM
  riscv: hwprobe: export Zimop ISA extension
  riscv: add ISA extension parsing for Zimop
  dt-bindings: riscv: add Zimop ISA extension description

Link: https://lore.kernel.org/r/20240619113529.676940-1-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:55:02 -07:00
Clément Léger
29cf9b803e
RISC-V: KVM: Allow Zcmop extension for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zcmop extension for Guest/VM.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240619113529.676940-16-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:59 -07:00
Clément Léger
fc078ea317
riscv: hwprobe: export Zcmop ISA extension
Export Zcmop ISA extension through hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Evan Green <evan@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-15-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:58 -07:00
Clément Léger
164d644059
riscv: add ISA extension parsing for Zcmop
Add parsing for Zcmop ISA extension which was ratified in commit
c732a4f39a4c ("Zcmop is ratified/1.0") of the riscv-isa-manual.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-14-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:57 -07:00
Clément Léger
d964e8f2ae
RISC-V: KVM: Allow Zca, Zcf, Zcd and Zcb extensions for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zca, Zcf, Zcd and Zcb extensions for Guest/VM.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240619113529.676940-11-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:54 -07:00
Clément Léger
0ad70db5eb
riscv: hwprobe: export Zca, Zcf, Zcd and Zcb ISA extensions
Export Zca, Zcf, Zcd and Zcb ISA extension through hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-10-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:53 -07:00
Clément Léger
ba4cd85583
riscv: add ISA parsing for Zca, Zcf, Zcd and Zcb
The Zc* standard extension for code reduction introduces new extensions.
This patch adds support for Zca, Zcf, Zcd and Zcb. Zce, Zcmt and Zcmp
are left out of this patch since they are targeting microcontrollers/
embedded CPUs instead of application processors.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240619113529.676940-9-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:52 -07:00
Clément Léger
625034abd5
riscv: add ISA extensions validation callback
Since a few extensions (Zicbom/Zicboz) already needs validation and
future ones will need it as well (Zc*) add a validate() callback to
struct riscv_isa_ext_data. This require to rework the way extensions are
parsed and split it in two phases. First phase is isa string or isa
extension list parsing and consists in enabling all the extensions in a
temporary bitmask (source isa) without any validation. The second step
"resolves" the final isa bitmap, handling potential missing dependencies.
The mechanism is quite simple and simply validate each extension
described in the source bitmap before enabling it in the resolved isa
bitmap. validate() callbacks can return either 0 for success,
-EPROBEDEFER if extension needs to be validated again at next loop. A
previous ISA bitmap is kept to avoid looping multiple times if an
extension dependencies are never satisfied until we reach a stable
state. In order to avoid any potential infinite looping, allow looping
a maximum of the number of extension we handle. Zicboz and Zicbom
extensions are modified to use this validation mechanism.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240619113529.676940-8-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:51 -07:00
Clément Léger
fb2a3d63ef
RISC-V: KVM: Allow Zimop extension for Guest/VM
Extend the KVM ISA extension ONE_REG interface to allow KVM user space
to detect and enable Zimop extension for Guest/VM.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Acked-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240619113529.676940-5-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:48 -07:00
Clément Léger
36f8960de8
riscv: hwprobe: export Zimop ISA extension
Export Zimop ISA extension through hwprobe.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-4-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:47 -07:00
Clément Léger
2467c2104f
riscv: add ISA extension parsing for Zimop
Add parsing for Zimop ISA extension which was ratified in commit
58220614a5f of the riscv-isa-manual.

Signed-off-by: Clément Léger <cleger@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240619113529.676940-3-cleger@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:54:46 -07:00
Palmer Dabbelt
cc2c169e34
Merge patch "riscv: stacktrace: convert arch_stack_walk() to noinstr"
This first patch in the larger series is a fix, so I'm merging it into
fixes while the rest of the patch set is still under development.

* b4-shazam-merge:
  riscv: stacktrace: convert arch_stack_walk() to noinstr

Link: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-0-1a538e12c01e@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:38:02 -07:00
Andy Chiu
23b2188920
riscv: stacktrace: convert arch_stack_walk() to noinstr
arch_stack_walk() is called intensively in function_graph when the
kernel is compiled with CONFIG_TRACE_IRQFLAGS. As a result, the kernel
logs a lot of arch_stack_walk and its sub-functions into the ftrace
buffer. However, these functions should not appear on the trace log
because they are part of the ftrace itself. This patch references what
arm64 does for the smae function. So it further prevent the re-enter
kprobe issue, which is also possible on riscv.

Related-to: commit 0fbcd8abf3 ("arm64: Prohibit instrumentation on arch_stack_walk()")
Fixes: 680341382d ("riscv: add CALLER_ADDRx support")
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240613-dev-andyc-dyn-ftrace-v4-v1-1-1a538e12c01e@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:37:59 -07:00
Alexandre Ghiti
edf2d546bf
riscv: patch: Flush the icache right after patching to avoid illegal insns
We cannot delay the icache flush after patching some functions as we may
have patched a function that will get called before the icache flush.

The only way to completely avoid such scenario is by flushing the icache
as soon as we patch a function. This will probably be costly as we don't
batch the icache maintenance anymore.

Fixes: 6ca445d8af ("riscv: Fix early ftrace nop patching")
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Closes: https://lore.kernel.org/linux-riscv/20240613-lubricant-breath-061192a9489a@wendy/
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andy Chiu <andy.chiu@sifive.com>
Link: https://lore.kernel.org/r/20240624082141.153871-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:37:27 -07:00
Palmer Dabbelt
d4b539adc8
Merge patch series "riscv: Various text patching improvements"
Samuel Holland <samuel.holland@sifive.com> says:

Here are a few changes to minimize calls to stop_machine() and
flush_icache_*() in the various text patching functions, as well as
to simplify the code.

* b4-shazam-merge:
  riscv: Remove extra variable in patch_text_nosync()
  riscv: Use offset_in_page() in text patching functions
  riscv: Pass patch_text() the length in bytes
  riscv: Simplify text patching loops
  riscv: kprobes: Use patch_text_nosync() for insn slots
  riscv: jump_label: Simplify assembly syntax
  riscv: jump_label: Batch icache maintenance

Link: https://lore.kernel.org/r/20240327160520.791322-1-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:35 -07:00
Samuel Holland
47742484ee
riscv: Remove extra variable in patch_text_nosync()
This cast is superfluous, and is incorrect anyway if compressed
instructions may be present.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240327160520.791322-8-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:33 -07:00
Samuel Holland
eaee548756
riscv: Use offset_in_page() in text patching functions
This is a bit easier to parse than the equivalent bit manipulation.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240327160520.791322-7-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:32 -07:00
Samuel Holland
51781ce8f4
riscv: Pass patch_text() the length in bytes
patch_text_nosync() already handles an arbitrary length of code, so this
removes a superfluous loop and reduces the number of icache flushes.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240327160520.791322-6-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:31 -07:00
Samuel Holland
5080ca0fe9
riscv: Simplify text patching loops
This reduces the number of variables and makes the code easier to parse.

Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240327160520.791322-5-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:30 -07:00
Samuel Holland
b1756750a3
riscv: kprobes: Use patch_text_nosync() for insn slots
These instructions are not yet visible to the rest of the system,
so there is no need to do the whole stop_machine() dance.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240327160520.791322-4-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:29 -07:00
Samuel Holland
2aa30d19cf
riscv: jump_label: Simplify assembly syntax
The idiomatic way to write "jal zero" is "j".

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240327160520.791322-3-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:28 -07:00
Samuel Holland
652b56b184
riscv: jump_label: Batch icache maintenance
Switch to the batched version of the jump label update functions so
instruction cache maintenance is deferred until the end of the update.

Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Signed-off-by: Samuel Holland <samuel.holland@sifive.com>
Link: https://lore.kernel.org/r/20240327160520.791322-2-samuel.holland@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-26 07:36:27 -07:00
Yu-Wei Hsu
e325618349 RISC-V: KVM: Redirect AMO load/store access fault traps to guest
The KVM RISC-V does not delegate AMO load/store access fault traps to
VS-mode (hedeleg) so typically M-mode takes these traps and redirects
them back to HS-mode. However, upon returning from M-mode, the KVM
RISC-V running in HS-mode terminates VS-mode software.

The KVM RISC-V should redirect AMO load/store access fault traps back
to VS-mode and let the VS-mode trap handler determine the next steps.

Signed-off-by: Yu-Wei Hsu <betterman5240@gmail.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20240429092113.70695-1-betterman5240@gmail.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:41 +05:30
Shenlin Liang
91195a90f1 RISCV: KVM: add tracepoints for entry and exit events
Like other architectures, RISCV KVM also needs to add these event
tracepoints to count the number of times kvm guest entry/exit.

Signed-off-by: Shenlin Liang <liangshenlin@eswincomputing.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Tested-by: Atish Patra <atishp@rivosinc.com>
Link: https://lore.kernel.org/r/20240422080833.8745-2-liangshenlin@eswincomputing.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:36 +05:30
Anup Patel
3385339296 RISC-V: KVM: Use IMSIC guest files when available
Let us discover and use IMSIC guest files from the IMSIC global
config provided by the IMSIC irqchip driver.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20240411090639.237119-3-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:34 +05:30
Anup Patel
e5b088c1dc RISC-V: KVM: Share APLIC and IMSIC defines with irqchip drivers
We have common APLIC and IMSIC headers available under
include/linux/irqchip/ directory which are used by APLIC
and IMSIC irqchip drivers. Let us replace the use of
kvm_aia_*.h headers with include/linux/irqchip/riscv-*.h
headers.

Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Link: https://lore.kernel.org/r/20240411090639.237119-2-apatel@ventanamicro.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-06-26 18:37:32 +05:30
Jesse Taube
04a2aef59c
RISC-V: fix vector insn load/store width mask
RVFDQ_FL_FS_WIDTH_MASK should be 3 bits [14-12], shifted down by 12 bits.
Replace GENMASK(3, 0) with GENMASK(2, 0).

Fixes: cd05483724 ("riscv: Allocate user's vector context in the first-use trap")
Signed-off-by: Jesse Taube <jesse@rivosinc.com>
Reviewed-by: Charlie Jenkins <charlie@rivosinc.com>
Link: https://lore.kernel.org/r/20240606182800.415831-1-jesse@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-25 08:47:10 -07:00
Arnd Bergmann
295f10061a syscalls: mmap(): use unsigned offset type consistently
Most architectures that implement the old-style mmap() with byte offset
use 'unsigned long' as the type for that offset, but microblaze and
riscv have the off_t type that is shared with userspace, matching the
prototype in include/asm-generic/syscalls.h.

Make this consistent by using an unsigned argument everywhere. This
changes the behavior slightly, as the argument is shifted to a page
number, and an user input with the top bit set would result in a
negative page offset rather than a large one as we use elsewhere.

For riscv, the 32-bit sys_mmap2() definition actually used a custom
type that is different from the global declaration, but this was
missed due to an incorrect type check.

Signed-off-by: Arnd Bergmann <arnd@arndb.de>
2024-06-25 15:57:38 +02:00
Hal Feng
4ed81d9dd7 riscv: dts: starfive: jh7110: Add the core reset and jh7110 compatible for uarts
Add the core reset for uarts, which is necessary for uarts to work.

Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Link: https://lore.kernel.org/r/20240604084729.57239-4-hal.feng@starfivetech.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-24 16:09:11 +02:00
Rafael Passos
9919c5c98c bpf: remove unused parameter in bpf_jit_binary_pack_finalize
Fixes a compiler warning. the bpf_jit_binary_pack_finalize function
was taking an extra bpf_prog parameter that went unused.
This removves it and updates the callers accordingly.

Signed-off-by: Rafael Passos <rafael@rcpassos.me>
Link: https://lore.kernel.org/r/20240615022641.210320-2-rafael@rcpassos.me
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-06-20 19:50:26 -07:00
Conor Dooley
3f41368fbf riscv: dts: microchip: add an initial devicetree for the BeagleV Fire
Add an initial devicetree for the BeagleV Fire. This devicetree differs
from that in the BeagleBoard BSP as it has a different memory
configuration, however it will boot on the same FPGA images. PCI is
disabled for now, as the Linux PCI driver (and the binding) assume
which root port instance is in use. This will need to be fixed before
PCI can be enabled.

Link: https://www.beagleboard.org/boards/beaglev-fire
Co-developed-by: Jamie Gibbons <jamie.gibbons@microchip.com>
Signed-off-by: Jamie Gibbons <jamie.gibbons@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 12:36:55 +01:00
Shengyu Qu
3c1f81a1b5 riscv: dts: starfive: Set EMMC vqmmc maximum voltage to 3.3V on JH7110 boards
Currently, for JH7110 boards with EMMC slot, vqmmc voltage for EMMC is
fixed to 1.8V, while the spec needs it to be 3.3V on low speed mode and
should support switching to 1.8V when using higher speed mode. Since
there are no other peripherals using the same voltage source of EMMC's
vqmmc(ALDO4) on every board currently supported by mainline kernel,
regulator-max-microvolt of ALDO4 should be set to 3.3V.

Cc: stable@vger.kernel.org
Signed-off-by: Shengyu Qu <wiagn233@outlook.com>
Fixes: 7dafcfa79c ("riscv: dts: starfive: enable DCDC1&ALDO4 node in axp15060")
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 12:19:32 +01:00
Matthias Brugger
edbce932b1 riscv: dts: starfive: Update flash partition layout
Up to now, the describe flash partition layout has some gaps.
Use the whole flash chip by getting rid of the gaps.

Suggested-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Matthias Brugger <matthias.bgg@gmail.com>
Reviewed-by: Emil Renner Berthing <emil.renner.berthing@canonical.com>
Reviewed-by: Heinrich Schuchardt <heinrich.schuchardt@canonical.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 11:05:43 +01:00
Inochi Amaoto
c61fea676b riscv: dts: thead: th1520: Add PMU event node
T-HEAD th1520 uses standard C910 chip and its pmu is already supported
by OpenSBI.

Add the pmu event description for T-HEAD th1520 SoC.

Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Link: https://www.xrvm.com/product/xuantie/4240217381324001280?spm=xrvm.27140568.0.0.7f979b29nzIa1m
Reviewed-by: Guo Ren <guoren@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 11:05:43 +01:00
Henry Bell
2606bf583b riscv: dts: starfive: add Star64 board devicetree
The Pine64 Star64 is a development board based on the Starfive JH7110 SoC.
The board features:

- JH7110 SoC
- 4/8 GiB LPDDR4 DRAM
- AXP15060 PMIC
- 40 pin GPIO header
- 1x USB 3.0 host port
- 3x USB 2.0 host port
- 1x eMMC slot
- 1x MicroSD slot
- 1x QSPI Flash
- 2x 1Gbps Ethernet port
- 1x HDMI port
- 1x 4-lane DSI
- 1x 2-lane CSI
- 1x PCIe 2.0 x1 lane

Signed-off-by: Henry Bell <dmoo_dv@protonmail.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-19 11:05:43 +01:00
Haylen Chu
890182bb3d riscv: dts: sophgo: disable write-protection for milkv duo
Milkv Duo does not have a write-protect pin, so disable write protect
to prevent SDcards misdetected as read-only.

Fixes: 89a7056ed4 ("riscv: dts: sophgo: add sdcard support for milkv duo")
Signed-off-by: Haylen Chu <heylenay@outlook.com>
Link: https://lore.kernel.org/r/SEYPR01MB4221943C7B101DD2318DA0D3D7CE2@SEYPR01MB4221.apcprd01.prod.exchangelabs.com
Signed-off-by: Inochi Amaoto <inochiama@outlook.com>
Signed-off-by: Chen Wang <unicorn_wang@outlook.com>
2024-06-19 08:46:03 +08:00
David Matlack
a6816314af KVM: Introduce vcpu->wants_to_run
Introduce vcpu->wants_to_run to indicate when a vCPU is in its core run
loop, i.e. when the vCPU is running the KVM_RUN ioctl and immediate_exit
was not set.

Replace all references to vcpu->run->immediate_exit with
!vcpu->wants_to_run to avoid TOCTOU races with userspace. For example, a
malicious userspace could invoked KVM_RUN with immediate_exit=true and
then after KVM reads it to set wants_to_run=false, flip it to false.
This would result in the vCPU running in KVM_RUN with
wants_to_run=false. This wouldn't cause any real bugs today but is a
dangerous landmine.

Signed-off-by: David Matlack <dmatlack@google.com>
Link: https://lore.kernel.org/r/20240503181734.1467938-2-dmatlack@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-18 09:20:01 -07:00
Jakub Kicinski
4c7d3d79c7 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts, no adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-13 13:13:46 -07:00
Hal Feng
d8a7d89abb riscv: defconfig: Enable StarFive JH7110 drivers
Add support for StarFive JH7110 SoC and VisionFive 2 board.
- Cache
- Temperature sensor
- PMIC (AXP15060)
- Ethernet PHY (YT8531)
- Restart GPIO
- RNG
- I2C
- SPI
- Quad SPI
- PCIe
- USB & USB 2.0 PHY & PCIe 2.0/USB 3.0 PHY
- Audio (I2S / TDM / PWM-DAC)
- MIPI-CSI2 RX & D-PHY RX

Signed-off-by: Hal Feng <hal.feng@starfivetech.com>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-06-12 23:16:55 +01:00
Sean Christopherson
2a27c43140 KVM: Delete the now unused kvm_arch_sched_in()
Delete kvm_arch_sched_in() now that all implementations are nops.

Reviewed-by: Bibo Mao <maobibo@loongson.cn>
Acked-by: Kai Huang <kai.huang@intel.com>
Link: https://lore.kernel.org/r/20240522014013.1672962-5-seanjc@google.com
Signed-off-by: Sean Christopherson <seanjc@google.com>
2024-06-11 14:18:45 -07:00
Steven Rostedt (Google)
5f7fb89a11 function_graph: Everyone uses HAVE_FUNCTION_GRAPH_RET_ADDR_PTR, remove it
All architectures that implement function graph also implements
HAVE_FUNCTION_GRAPH_RET_ADDR_PTR. Remove it, as it is no longer a
differentiator.

Link: https://lore.kernel.org/linux-trace-kernel/20240611031737.982047614@goodmis.org

Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: WANG Xuerui <kernel@xen0n.name>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: "Naveen N. Rao" <naveen.n.rao@linux.ibm.com>
Cc: Paul Walmsley <paul.walmsley@sifive.com>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Christian Borntraeger <borntraeger@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Signed-off-by: Steven Rostedt (Google) <rostedt@goodmis.org>
2024-06-11 11:18:24 -04:00
Jakub Kicinski
b1156532bc bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZmIsRAAKCRDbK58LschI
 g4SSAP0bkl6rPMn7zp1h+/l7hlvpp2aVOmasBTe8hIhAGUbluwD/TGq4sNsGgXFI
 i4tUtFRhw8pOjy2guy6526qyJvBs8wY=
 =WMhY
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-06-06

We've added 54 non-merge commits during the last 10 day(s) which contain
a total of 50 files changed, 1887 insertions(+), 527 deletions(-).

The main changes are:

1) Add a user space notification mechanism via epoll when a struct_ops
   object is getting detached/unregistered, from Kui-Feng Lee.

2) Big batch of BPF selftest refactoring for sockmap and BPF congctl
   tests, from Geliang Tang.

3) Add BTF field (type and string fields, right now) iterator support
   to libbpf instead of using existing callback-based approaches,
   from Andrii Nakryiko.

4) Extend BPF selftests for the latter with a new btf_field_iter
   selftest, from Alan Maguire.

5) Add new kfuncs for a generic, open-coded bits iterator,
   from Yafang Shao.

6) Fix BPF selftests' kallsyms_find() helper under kernels configured
   with CONFIG_LTO_CLANG_THIN, from Yonghong Song.

7) Remove a bunch of unused structs in BPF selftests,
   from David Alan Gilbert.

8) Convert test_sockmap section names into names understood by libbpf
   so it can deduce program type and attach type, from Jakub Sitnicki.

9) Extend libbpf with the ability to configure log verbosity
   via LIBBPF_LOG_LEVEL environment variable, from Mykyta Yatsenko.

10) Fix BPF selftests with regards to bpf_cookie and find_vma flakiness
    in nested VMs, from Song Liu.

11) Extend riscv32/64 JITs to introduce shift/add helpers to generate Zba
    optimization, from Xiao Wang.

12) Enable BPF programs to declare arrays and struct fields with kptr,
    bpf_rb_root, and bpf_list_head, from Kui-Feng Lee.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (54 commits)
  selftests/bpf: Drop useless arguments of do_test in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp in bpf_tcp_ca
  selftests/bpf: Use start_test in test_dctcp_fallback in bpf_tcp_ca
  selftests/bpf: Add start_test helper in bpf_tcp_ca
  selftests/bpf: Use connect_to_fd_opts in do_test in bpf_tcp_ca
  libbpf: Auto-attach struct_ops BPF maps in BPF skeleton
  selftests/bpf: Add btf_field_iter selftests
  selftests/bpf: Fix send_signal test with nested CONFIG_PARAVIRT
  libbpf: Remove callback-based type/string BTF field visitor helpers
  bpftool: Use BTF field iterator in btfgen
  libbpf: Make use of BTF field iterator in BTF handling code
  libbpf: Make use of BTF field iterator in BPF linker code
  libbpf: Add BTF field iterator
  selftests/bpf: Ignore .llvm.<hash> suffix in kallsyms_find()
  selftests/bpf: Fix bpf_cookie and find_vma in nested VM
  selftests/bpf: Test global bpf_list_head arrays.
  selftests/bpf: Test global bpf_rb_root arrays and fields in nested struct types.
  selftests/bpf: Test kptr arrays and kptrs in nested struct fields.
  bpf: limit the number of levels of a nested struct type.
  bpf: look into the types of the fields of a struct type recursively.
  ...
====================

Link: https://lore.kernel.org/r/20240606223146.23020-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-10 18:02:14 -07:00
Linus Torvalds
0a02756d91 RISC-V Fixes for 6.10-rc3
* Another fix to avoid allocating pages that overlap with ERR_PTR, which
   manifests on rv32.
 * A revert for the badaccess patch I incorrectly picked up an early
   version of.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmZjGDcTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWGoD/9LuIorx3Oxtq8oWq42DpMHQDKIJU/p
 PlopOk0acCucfg13ZfAQDJ0H8PN64SLJ/Zhzh/+Ge+6s1FRD1b5pqsT0ZKUytsrk
 pdWVha8QAQYCm3YekLGvQCHHN1Lh3/dilOvZEJZN1l+ltnRFh+tfurB+E/Cre/mb
 pRMGDtAASXGUAkuEZqKfxEe7zQ4xRnyVx0DtcFl1xVjNP11ywwwbnkf7GyrhmDHT
 SITcjF3NaoqQ4zgEwD8MflL6k0fwhwECw3RBvwT23jfHX56ttPJRw0bn9l8UlY25
 HdvBw45bcIbva0COIMIBnYjpokj5wIOnaOhJSHbHllkVFTpBcaiO+CgDj440sAlJ
 4vvl7ll8Fn/DCEjzQgvePndjzWPyxEGkAU6b8f68wj0dfRU/FW16GE3nO/mYLjjx
 RWob8+/qh6kg51KsaCdDNHaASoeVG6ZYokIOYecc30rDZv2fjQv5ly81k6UpKcCu
 0v9el1vX7KaM6GYQkIhyJbnlCCTGLfKJWcWU/Sup7pw7S2HmAkwV3LnuUQwkVXrG
 Qc6GWTbD7YrfVVQF+yX/PrpmuhQWk6ZUZjoA20jmkoeN4IDUG5l1j6xy6I0dDp6T
 OYSYdQNuGfic2X6wO73iImFJTdleuqFVW5GjiD2LkZXziV5rMGpgaXVGzhwnltk8
 A+B53s9STVmbyQ==
 =tO1B
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull RISC-V fixes from Palmer Dabbelt:

 - Another fix to avoid allocating pages that overlap with ERR_PTR,
   which manifests on rv32

 - A revert for the badaccess patch I incorrectly picked up an early
   version of

* tag 'riscv-for-linus-6.10-rc3' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  Revert "riscv: mm: accelerate pagefault when badaccess"
  riscv: fix overlap of allocated page and PTR_ERR
2024-06-07 14:47:38 -07:00
Jakub Kicinski
62b5bf58b9 Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

No conflicts.

Adjacent changes:

drivers/net/ethernet/pensando/ionic/ionic_txrx.c
  d9c0420999 ("ionic: Mark error paths in the data path as unlikely")
  491aee894a ("ionic: fix kernel panic in XDP_TX action")

net/ipv6/ip6_fib.c
  b4cb4a1391 ("net: use unrcu_pointer() helper")
  b01e1c0307 ("ipv6: fix possible race in __fib6_drop_pcpu_from()")

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-06-06 12:06:56 -07:00
Paolo Bonzini
b50788f7cd KVM/riscv fixes for 6.10, take #1
- No need to use mask when hart-index-bits is 0
 - Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext()
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEZdn75s5e6LHDQ+f/rUjsVaLHLAcFAmZZ/j0ACgkQrUjsVaLH
 LAcLyg//Tz9gFc7kTPR9kBLhDgTytS0WuBDnh8P09bGAv/36+AJZ8ZbXDTvh44Ul
 w1sLYjDVW08kCWxkTg9VPVxv467O3E7F1bduXMTBI2EBZxPul+T+p8Rv9iunAGD/
 945nkCpAMOl6xiaIFlIISvOH1aviTp/0YcmFGaE3AcbBb7kEaZ2MONuvffedSGV4
 5/MkAKtHpKj7Rekjrnpz0Nl+KVqYqCMS8PAvJNKQgG/e7MC2IHfRX+2GYnBZH1ur
 cENKUotSbmrzoeA9nmB8Yl74qgW/OA7fUU5CBb/x7JYl5oZbQs203z8ve9XIqcwk
 HtQEOp0G/EL3rmQanZl47zWG0ySI5pXRglc7DC8lnyiKjd9+Ut7VIqAVCd4H4ZC5
 5c413zPxrCtEEK+tnkrvnvK+Hks5ZtqXMfpFJQSf0twXt5JpBlgzVL6Suf48/9sA
 IRkkt4kq/XVq4ZErkeHE/pH1kwVqYMfjUHlc0L0dChcA2lNARQbfO+P215xU/H9d
 1ym02fL6P5ZCjzW0t+IfdvK6u7EbI9nAyBw/+8cW8ppPoJBP5AaiBKfWz20XcasE
 3Vktr7njqJf/eaI9oyUnTlkfaktb4Ll8WPnifZgsc9zMJQtCVw1TK4Q1bMtDJ1zX
 GcQlGlu7i1vrmkUUXPE37aQehuUBYk7rmX6B+CW56ILUm62hS0A=
 =oPq/
 -----END PGP SIGNATURE-----

Merge tag 'kvm-riscv-fixes-6.10-1' of https://github.com/kvm-riscv/linux into HEAD

KVM/riscv fixes for 6.10, take #1

- No need to use mask when hart-index-bits is 0
- Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext()
2024-06-03 13:18:18 -04:00
Xiao Wang
96a27ee76f riscv, bpf: Introduce shift add helper with Zba optimization
Zba extension is very useful for generating addresses that index into array
of basic data types. This patch introduces sh2add and sh3add helpers for
RV32 and RV64 respectively, to accelerate addressing for array of unsigned
long data.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240524075543.4050464-3-xiao.w.wang@intel.com
2024-06-03 16:45:23 +02:00
Palmer Dabbelt
e2c79b4c5c
Revert "riscv: mm: accelerate pagefault when badaccess"
I accidentally picked up an earlier version of this patch, which had
already landed via mm.  The patch  I picked up contains a bug, which I
kept as I thought it was a fix.  So let's just revert it.

This reverts commit 4c6c002042.

Fixes: 4c6c002042 ("riscv: mm: accelerate pagefault when badaccess")
Reviewed-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240530164451.21336-1-palmer@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-03 07:41:13 -07:00
Nam Cao
994af1825a
riscv: fix overlap of allocated page and PTR_ERR
On riscv32, it is possible for the last page in virtual address space
(0xfffff000) to be allocated. This page overlaps with PTR_ERR, so that
shouldn't happen.

There is already some code to ensure memblock won't allocate the last page.
However, buddy allocator is left unchecked.

Fix this by reserving physical memory that would be mapped at virtual
addresses greater than 0xfffff000.

Reported-by: Björn Töpel <bjorn@kernel.org>
Closes: https://lore.kernel.org/linux-riscv/878r1ibpdn.fsf@all.your.base.are.belong.to.us
Fixes: 76d2a0493a ("RISC-V: Init and Halt Code")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Reviewed-by: Mike Rapoport (IBM) <rppt@kernel.org>
Link: https://lore.kernel.org/r/20240425115201.3044202-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-06-03 07:41:09 -07:00
Jakub Kicinski
e19de2064f Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net
Cross-merge networking fixes after downstream PR.

Conflicts:

drivers/net/ethernet/ti/icssg/icssg_classifier.c
  abd5576b9c ("net: ti: icssg-prueth: Add support for ICSSG switch firmware")
  56a5cf538c ("net: ti: icssg-prueth: Fix start counter for ft1 filter")
https://lore.kernel.org/all/20240531123822.3bb7eadf@canb.auug.org.au/

No other adjacent changes.

Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-31 14:10:28 -07:00
Quan Zhou
c66f3b40b1 RISC-V: KVM: Fix incorrect reg_subtype labels in kvm_riscv_vcpu_set_reg_isa_ext function
In the function kvm_riscv_vcpu_set_reg_isa_ext, the original code
used incorrect reg_subtype labels KVM_REG_RISCV_SBI_MULTI_EN/DIS.
These have been corrected to KVM_REG_RISCV_ISA_MULTI_EN/DIS respectively.
Although they are numerically equivalent, the actual processing
will not result in errors, but it may lead to ambiguous code semantics.

Fixes: 613029442a ("RISC-V: KVM: Extend ONE_REG to enable/disable multiple ISA extensions")
Signed-off-by: Quan Zhou <zhouquan@iscas.ac.cn>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/ff1c6771a67d660db94372ac9aaa40f51e5e0090.1716429371.git.zhouquan@iscas.ac.cn
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-05-31 10:40:39 +05:30
Yong-Xuan Wang
2d707b4e37 RISC-V: KVM: No need to use mask when hart-index-bit is 0
When the maximum hart number within groups is 1, hart-index-bit is set to
0. Consequently, there is no need to restore the hart ID from IMSIC
addresses and hart-index-bit settings. Currently, QEMU and kvmtool do not
pass correct hart-index-bit values when the maximum hart number is a
power of 2, thereby avoiding this issue. Corresponding patches for QEMU
and kvmtool will also be dispatched.

Fixes: 89d01306e3 ("RISC-V: KVM: Implement device interface for AIA irqchip")
Signed-off-by: Yong-Xuan Wang <yongxuan.wang@sifive.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Link: https://lore.kernel.org/r/20240415064905.25184-1-yongxuan.wang@sifive.com
Signed-off-by: Anup Patel <anup@brainfault.org>
2024-05-31 09:47:08 +05:30
Andy Chiu
ac295b6742
riscv: vector: adjust minimum Vector requirement to ZVE32X
Make has_vector() to check for ZVE32X. Every in-kernel usage of V that
requires a more complicate version of V must then call out explicitly.

Also, change riscv_v_first_use_handler(), and boot code that calls
riscv_v_setup_vsize() to accept ZVE32X.

Most kernel/user interfaces requires minimum of ZVE32X. Thus, programs
compiled and run with ZVE32X should be supported by the kernel on most
aspects. This includes context-switch, signal, ptrace, prctl, and
hwprobe.

One exception is that ELF_HWCAP returns 'V' only if full V is supported
on the platform. This means that the system without a full V must not
rely on ELF_HWCAP to tell whether it is allowable to execute Vector
without first invoking a prctl() check.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Acked-by: Joel Granados <j.granados@samsung.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-7-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:10 -07:00
Andy Chiu
de8f8282a9
riscv: hwprobe: add zve Vector subextensions into hwprobe interface
The following Vector subextensions for "embedded" platforms are added
into RISCV_HWPROBE_KEY_IMA_EXT_0:
 - ZVE32X
 - ZVE32F
 - ZVE64X
 - ZVE64F
 - ZVE64D

Extensions ending with an X indicates that the platform doesn't have a
vector FPU.
Extensions ending with F/D mean that whether single (F) or double (D)
precision vector operation is supported.
The number 32 or 64 follows from ZVE tells the maximum element length.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Clément Léger <cleger@rivosinc.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-6-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:09 -07:00
Andy Chiu
1e7483542b
riscv: cpufeature: add zve32[xf] and zve64[xfd] isa detection
Multiple Vector subextensions are added. Also, the patch takes care of
the dependencies of Vector subextensions by macro expansions. So, if
some "embedded" platform only reports "zve64f" on the ISA string, the
parser is able to expand it to zve32x zve32f zve64x and zve64f.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-5-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:08 -07:00
Andy Chiu
98a5700dfa
riscv: cpufeature: call match_isa_ext() for single-letter extensions
Single-letter extensions may also imply multiple subextensions. For
example, Vector extension implies zve64d, and zve64d implies zve64f.

Extension parsing for "riscv,isa-extensions" has the ability to resolve
the dependency by calling match_isa_ext(). This patch makes deprecated
parser call the same function for single letter extensions.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-3-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:06 -07:00
Andy Chiu
38a94c4666
riscv: smp: fail booting up smp if inconsistent vlen is detected
Currently we only support Vector for SMP platforms, that is, all SMP
cores have the same vlenb. If we happen to detect a mismatching vlen, it
is better to just fail bootting it up to prevent further race/scheduling
issues.

Also, move .Lsecondary_park forward and chage `tail smp_callin` into a
regular call in the early assembly. So a core would be parked right
after a return from smp_callin. Note that a successful smp_callin
does not return.

Fixes: 7017858eb2 ("riscv: Introduce riscv_v_vsize to record size of Vector context")
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Closes: https://lore.kernel.org/linux-riscv/20240228-vicinity-cornstalk-4b8eb5fe5730@spud/
Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Yunhui Cui <cuiyunhui@bytedance.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-2-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:05 -07:00
Andy Chiu
77afe3e514
riscv: vector: add a comment when calling riscv_setup_vsize()
The function would fail when it detects the calling hart's vlen doesn't
match the first one's. The boot hart is the first hart calling this
function during riscv_fill_hwcap, so it is impossible to fail here. Add
a comment about this behavior.

Signed-off-by: Andy Chiu <andy.chiu@sifive.com>
Reviewed-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20240510-zve-detection-v5-1-0711bdd26c12@sifive.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 14:33:04 -07:00
Alexandre Ghiti
1d84afaf02
riscv: Fix fully ordered LR/SC xchg[8|16]() implementations
The fully ordered versions of xchg[8|16]() using LR/SC lack the
necessary memory barriers to guarantee the order.

Fix this by matching what is already implemented in the fully ordered
versions of cmpxchg() using LR/SC.

Suggested-by: Andrea Parri <parri.andrea@gmail.com>
Reported-by: Andrea Parri <parri.andrea@gmail.com>
Closes: https://lore.kernel.org/linux-riscv/ZlYbupL5XgzgA0MX@andrea/T/#u
Fixes: a8ed2b7a2c ("riscv/cmpxchg: Implement xchg for variables of size 1 and 2")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lore.kernel.org/r/20240530145546.394248-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 09:43:14 -07:00
Nam Cao
7bed516174
riscv: enable HAVE_ARCH_HUGE_VMAP for XIP kernel
HAVE_ARCH_HUGE_VMAP also works on XIP kernel, so remove its dependency on
!XIP_KERNEL.

This also fixes a boot problem for XIP kernel introduced by the commit in
"Fixes:". This commit used huge page mapping for vmemmap, but huge page
vmap was not enabled for XIP kernel.

Fixes: ff172d4818 ("riscv: Use hugepage mappings for vmemmap")
Signed-off-by: Nam Cao <namcao@linutronix.de>
Cc: <stable@vger.kernel.org>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240526110104.470429-1-namcao@linutronix.de
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 09:42:52 -07:00
Sergey Matyukevich
a638b0461b
riscv: prevent pt_regs corruption for secondary idle threads
Top of the kernel thread stack should be reserved for pt_regs. However
this is not the case for the idle threads of the secondary boot harts.
Their stacks overlap with their pt_regs, so both may get corrupted.

Similar issue has been fixed for the primary hart, see c7cdd96eca
("riscv: prevent stack corruption by reserving task_pt_regs(p) early").
However that fix was not propagated to the secondary harts. The problem
has been noticed in some CPU hotplug tests with V enabled. The function
smp_callin stored several registers on stack, corrupting top of pt_regs
structure including status field. As a result, kernel attempted to save
or restore inexistent V context.

Fixes: 9a2451f186 ("RISC-V: Avoid using per cpu array for ordered booting")
Fixes: 2875fe0561 ("RISC-V: Add cpu_ops and modify default booting method")
Signed-off-by: Sergey Matyukevich <sergey.matyukevich@syntacore.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240523084327.2013211-1-geomatsi@gmail.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-30 09:42:51 -07:00
Jakub Kicinski
4b3529edbb bpf-next-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZlWtmQAKCRDbK58LschI
 g0TUAQDT76jx7Rq1DShCtZ3eqiBMNkYczK8b+GqNsSG8YGduaAEA1jn/GN+H65Rh
 atQZ/pYAfLZflMV04+XE0GyBr5q1uQg=
 =NczG
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next

Daniel Borkmann says:

====================
pull-request: bpf-next 2024-05-28

We've added 23 non-merge commits during the last 11 day(s) which contain
a total of 45 files changed, 696 insertions(+), 277 deletions(-).

The main changes are:

1) Rename skb's mono_delivery_time to tstamp_type for extensibility
   and add SKB_CLOCK_TAI type support to bpf_skb_set_tstamp(),
   from Abhishek Chauhan.

2) Add netfilter CT zone ID and direction to bpf_ct_opts so that arbitrary
   CT zones can be used from XDP/tc BPF netfilter CT helper functions,
   from Brad Cowie.

3) Several tweaks to the instruction-set.rst IETF doc to address
   the Last Call review comments, from Dave Thaler.

4) Small batch of riscv64 BPF JIT optimizations in order to emit more
   compressed instructions to the JITed image for better icache efficiency,
   from Xiao Wang.

5) Sort bpftool C dump output from BTF, aiming to simplify vmlinux.h
   diffing and forcing more natural type definitions ordering,
   from Mykyta Yatsenko.

6) Use DEV_STATS_INC() macro in BPF redirect helpers to silence
   a syzbot/KCSAN race report for the tx_errors counter,
   from Jiang Yunshui.

7) Un-constify bpf_func_info in bpftool to fix compilation with LLVM 17+
   which started treating const structs as constants and thus breaking
   full BTF program name resolution, from Ivan Babrou.

8) Fix up BPF program numbers in test_sockmap selftest in order to reduce
   some of the test-internal array sizes, from Geliang Tang.

9) Small cleanup in Makefile.btf script to use test-ge check for v1.25-only
   pahole, from Alan Maguire.

10) Fix bpftool's make dependencies for vmlinux.h in order to avoid needless
    rebuilds in some corner cases, from Artem Savkov.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next: (23 commits)
  bpf, net: Use DEV_STAT_INC()
  bpf, docs: Fix instruction.rst indentation
  bpf, docs: Clarify call local offset
  bpf, docs: Add table captions
  bpf, docs: clarify sign extension of 64-bit use of 32-bit imm
  bpf, docs: Use RFC 2119 language for ISA requirements
  bpf, docs: Move sentence about returning R0 to abi.rst
  bpf: constify member bpf_sysctl_kern:: Table
  riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT
  riscv, bpf: Use STACK_ALIGN macro for size rounding up
  riscv, bpf: Optimize zextw insn with Zba extension
  selftests/bpf: Handle forwarding of UDP CLOCK_TAI packets
  net: Add additional bit to support clockid_t timestamp type
  net: Rename mono_delivery_time to tstamp_type for scalabilty
  selftests/bpf: Update tests for new ct zone opts for nf_conntrack kfuncs
  net: netfilter: Make ct zone opts configurable for bpf ct helpers
  selftests/bpf: Fix prog numbers in test_sockmap
  bpf: Remove unused variable "prev_state"
  bpftool: Un-const bpf_func_info to fix it for llvm 17 and newer
  bpf: Fix order of args in call to bpf_map_kvcalloc
  ...
====================

Link: https://lore.kernel.org/r/20240528105924.30905-1-daniel@iogearbox.net
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-05-28 07:27:29 -07:00
Geert Uytterhoeven
2c917b55d6 riscv: dts: canaan: Disable I/O devices unless used
It is considered good practice to disable on-SoC devices providing
external I/O in the SoC-specific .dtsi, and enable them explicitly in
the board-specific DTS files when actually wired-up and used.

Hence:
  - Set the status of I/O devices in k210.dtsi to "disabled",
  - Override the status of used I/O devices in board-specific DTS files
    to "okay",
  - Drop unneeded status overrides in board DTS-specific files for the
    always-enabled pin controller.

On e.g. MAiXBiT, this gets rid of an error message when probing the
unused slave-only spi2 controller:

    dw_spi_mmio 50240000.spi: error -22: problem registering spi host
    dw_spi_mmio 50240000.spi: probe with driver dw_spi_mmio failed with error -22

which is seen since commit 98d75b9ef2 ("spi: dw: Drop default
number of CS setting").

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-28 12:25:54 +01:00
Geert Uytterhoeven
9235784cb6 riscv: dts: canaan: Clean up serial aliases
The SoC-specific k210.dtsi declares aliases for all four serial ports.
However, none of the board-specific DTS files configure pin control for
any but the first serial port, so the last three ports are not usable.

Move the aliases node from the SoC-specific k210.dtsi to the
board-specific DTS files, as these are really board-specific, and retain
the sole port that is usable.

Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
2024-05-28 12:25:54 +01:00
Linus Torvalds
f1f9984fdc RISC-V Patches for the 6.10 Merge Window, Part 2
* The compression format used for boot images is now configurable at
   build time, and these formats are shown in `make help`.
 * access_ok() has been optimized.
 * A pair of performance bugs have been fixed in the uaccess handlers.
 * Various fixes and cleanups, including one for the IMSIC build failure
   and one for the early-boot ftrace illegal NOPs bug.
 -----BEGIN PGP SIGNATURE-----
 
 iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmZQtRwTHHBhbG1lckBk
 YWJiZWx0LmNvbQAKCRAuExnzX7sYiWPBD/0UitwMg88m6urvMd0Pfvwwbu/OnGqW
 TZT8C55iJi/e5f9K4mBrSyjATI8z/MblD+Zz0adX8ygavS4JuQ7DoWwb1yTT3pww
 +z74FkWeJuiar+HfbhQ602CfMrnzvWjnyJ3URemqy5pIBKyvD9gGkDJDZwf8hJTk
 Vqh5qVtnBqFBO9kWpIx+/pLCfpyHVNkhWr1AzKfoqQ1WPIpZ/o0IGdvS88rL+EBR
 QOXiwVhEsRfC+LT6Jhn8l2bGp7PaSRVOid19OxNsJKpAhpL6AOscaafclVrLBuTd
 gkys0rT2dHdoWTAkPHQpvlOI6OmGTgopxo5pUKJHS8J9VRoBun25zC1FGBF8uyVd
 05CabWPnh7olNsRge9XiNj3x8PXjGVi7X7wUbRgOBG5aDc6TbKdxu37J0tXe0M7a
 Q74ctQvk8Nk6bQWirgTNlfJJHzL5pJbKc9VwY5uGX4qTmH+yEvCIt45ZXgXOuS/F
 eqijStkkdXUDnkMdcpaZJvXP80rHcgfP8bqevvPymRli8ER9zj9aXJQ3rmCUcPz+
 EtbyS+vOEN31wNTA1EQlfIRxfvr22x7r70DDdRwmhuD1W1tgfblm+R0Cq76I5rnJ
 VSgXKq1b4mY0eautqXEnPGyqb7H8iJIq7AoyfbzzWN+4u6yVEUvpDKueeksy+fFt
 sGNtjWqGhWyKXg==
 =/Qtt
 -----END PGP SIGNATURE-----

Merge tag 'riscv-for-linus-6.10-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux

Pull more RISC-V updates from Palmer Dabbelt:

 - The compression format used for boot images is now configurable at
   build time, and these formats are shown in `make help`

 - access_ok() has been optimized

 - A pair of performance bugs have been fixed in the uaccess handlers

 - Various fixes and cleanups, including one for the IMSIC build failure
   and one for the early-boot ftrace illegal NOPs bug

* tag 'riscv-for-linus-6.10-mw2' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
  riscv: Fix early ftrace nop patching
  irqchip: riscv-imsic: Fixup riscv_ipi_set_virq_range() conflict
  riscv: selftests: Add signal handling vector tests
  riscv: mm: accelerate pagefault when badaccess
  riscv: uaccess: Relax the threshold for fast path
  riscv: uaccess: Allow the last potential unrolled copy
  riscv: typo in comment for get_f64_reg
  Use bool value in set_cpu_online()
  riscv: selftests: Add hwprobe binaries to .gitignore
  riscv: stacktrace: fixed walk_stackframe()
  ftrace: riscv: move from REGS to ARGS
  riscv: do not select MODULE_SECTIONS by default
  riscv: show help string for riscv-specific targets
  riscv: make image compression configurable
  riscv: cpufeature: Fix extension subset checking
  riscv: cpufeature: Fix thead vector hwcap removal
  riscv: rewrite __kernel_map_pages() to fix sleeping in invalid context
  riscv: force PAGE_SIZE linear mapping if debug_pagealloc is enabled
  riscv: Define TASK_SIZE_MAX for __access_ok()
  riscv: Remove PGDIR_SIZE_L3 and TASK_SIZE_MIN
2024-05-24 10:46:35 -07:00
Xiao Wang
99fa63d9ca riscv, bpf: Try RVC for reg move within BPF_CMPXCHG JIT
We could try to emit compressed insn for reg move operation during CMPXCHG
JIT, the instruction compression has no impact on the jump offsets of
following forward and backward jump instructions.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Acked-by: Björn Töpel <bjorn@kernel.org>
Link: https://lore.kernel.org/bpf/20240519050507.2217791-1-xiao.w.wang@intel.com
2024-05-24 17:40:33 +02:00
Xiao Wang
e944fc8152 riscv, bpf: Use STACK_ALIGN macro for size rounding up
Use the macro STACK_ALIGN that is defined in asm/processor.h for stack size
rounding up, just like bpf_jit_comp32.c does.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240523031835.3977713-1-xiao.w.wang@intel.com
2024-05-24 17:16:56 +02:00
Xiao Wang
c12603e76e riscv, bpf: Optimize zextw insn with Zba extension
The Zba extension provides add.uw insn which can be used to implement
zext.w with rs2 set as ZERO.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
Tested-by: Pu Lehui <pulehui@huawei.com>
Reviewed-by: Andrew Jones <ajones@ventanamicro.com>
Reviewed-by: Pu Lehui <pulehui@huawei.com>
Link: https://lore.kernel.org/bpf/20240516090430.493122-1-xiao.w.wang@intel.com
2024-05-24 16:53:12 +02:00
Alexandre Ghiti
6ca445d8af
riscv: Fix early ftrace nop patching
Commit c97bf62996 ("riscv: Fix text patching when IPI are used")
converted ftrace_make_nop() to use patch_insn_write() which does not
emit any icache flush relying entirely on __ftrace_modify_code() to do
that.

But we missed that ftrace_make_nop() was called very early directly when
converting mcount calls into nops (actually on riscv it converts 2B nops
emitted by the compiler into 4B nops).

This caused crashes on multiple HW as reported by Conor and Björn since
the booting core could have half-patched instructions in its icache
which would trigger an illegal instruction trap: fix this by emitting a
local flush icache when early patching nops.

Fixes: c97bf62996 ("riscv: Fix text patching when IPI are used")
Signed-off-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Reported-by: Conor Dooley <conor.dooley@microchip.com>
Tested-by: Conor Dooley <conor.dooley@microchip.com>
Reviewed-by: Björn Töpel <bjorn@rivosinc.com>
Tested-by: Björn Töpel <bjorn@rivosinc.com>
Link: https://lore.kernel.org/r/20240523115134.70380-1-alexghiti@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-23 08:22:17 -07:00
Linus Torvalds
c760b3725e - A series ("kbuild: enable more warnings by default") from Arnd
Bergmann which enables a number of additional build-time warnings.  We
   fixed all the fallout which we could find, there may still be a few
   stragglers.
 
 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API".  This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer AMD
   GPUs on RISC-V.
 
 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".
 
 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTTMBEPP41GrTpTJgfdBJ7gKXxAjgUCZk6OSAAKCRDdBJ7gKXxA
 jpTGAP9hQaZ+g7CO38hKQAtEI8rwcZJtvUAP84pZEGMjYMGLxQD/S8z1o7UHx61j
 DUbnunbOkU/UcPx3Fs/gp4KcJARMEgs=
 =EPi9
 -----END PGP SIGNATURE-----

Merge tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm

Pull more non-mm updates from Andrew Morton:

 - A series ("kbuild: enable more warnings by default") from Arnd
   Bergmann which enables a number of additional build-time warnings. We
   fixed all the fallout which we could find, there may still be a few
   stragglers.

 - Samuel Holland has developed the series "Unified cross-architecture
   kernel-mode FPU API". This does a lot of consolidation of
   per-architecture kernel-mode FPU usage and enables the use of newer
   AMD GPUs on RISC-V.

 - Tao Su has fixed some selftests build warnings in the series
   "Selftests: Fix compilation warnings due to missing _GNU_SOURCE
   definition".

 - This pull also includes a nilfs2 fixup from Ryusuke Konishi.

* tag 'mm-nonmm-stable-2024-05-22-17-30' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (23 commits)
  nilfs2: make block erasure safe in nilfs_finish_roll_forward()
  selftests/harness: use 1024 in place of LINE_MAX
  Revert "selftests/harness: remove use of LINE_MAX"
  selftests/fpu: allow building on other architectures
  selftests/fpu: move FP code to a separate translation unit
  drm/amd/display: use ARCH_HAS_KERNEL_FPU_SUPPORT
  drm/amd/display: only use hard-float, not altivec on powerpc
  riscv: add support for kernel-mode FPU
  x86: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  powerpc: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  LoongArch: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  lib/raid6: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  arm64: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  ARM: crypto: use CC_FLAGS_FPU for NEON CFLAGS
  ARM: implement ARCH_HAS_KERNEL_FPU_SUPPORT
  arch: add ARCH_HAS_KERNEL_FPU_SUPPORT
  x86/fpu: fix asm/fpu/types.h include guard
  kbuild: enable -Wcast-function-type-strict unconditionally
  kbuild: enable -Wformat-truncation on clang
  ...
2024-05-22 18:59:29 -07:00
Palmer Dabbelt
c6c901b7d9
Merge patch series "riscv: Extension parsing fixes"
Charlie Jenkins <charlie@rivosinc.com> says:

This series contains two minor fixes for the extension parsing in
cpufeature.c.

Some T-Head boards without vector 1.0 support report "v" in the isa
string in their DT which will cause the kernel to run vector code. The
code to blacklist "v" from these boards was doing so by using
riscv_cached_mvendorid() which has not been populated at the time of
extension parsing. This fix instead greedily reads the mvendorid CSR of
the boot hart to determine if the cpu is from T-Head.

The other fix is for an incorrect indexing bug. riscv extensions
sometimes imply other extensions. When adding these "subset" extensions
to the hardware capabilities array, they need to be checked if they are
valid. The current code only checks if the extension that is including
other extensions is valid and not the subset extensions.

These patches were previously included in:
https://lore.kernel.org/lkml/20240420-dev-charlie-support_thead_vector_6_9-v3-0-67cff4271d1d@rivosinc.com/

* b4-shazam-merge:
  riscv: cpufeature: Fix extension subset checking
  riscv: cpufeature: Fix thead vector hwcap removal

Link: https://lore.kernel.org/r/20240502-cpufeature_fixes-v4-0-b3d1a088722d@rivosinc.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:58 -07:00
Kefeng Wang
4c6c002042
riscv: mm: accelerate pagefault when badaccess
The access_error() of vma already checked under per-VMA lock, if it
is a bad access, directly handle error, no need to retry with mmap_lock
again. Since the page faut is handled under per-VMA lock, count it as
a vma lock event with VMA_LOCK_SUCCESS.

Reviewed-by: Suren Baghdasaryan <surenb@google.com>
Signed-off-by: Kefeng Wang <wangkefeng.wang@huawei.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Tested-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240403083805.1818160-6-wangkefeng.wang@huawei.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:56 -07:00
Xiao Wang
9850e73e82
riscv: uaccess: Relax the threshold for fast path
The bytes copy for unaligned head would cover at most SZREG-1 bytes, so
it's better to set the threshold as >= (SZREG-1 + word_copy stride size)
which equals to 9*SZREG-1.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240313091929.4029960-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:55 -07:00
Xiao Wang
f1905946be
riscv: uaccess: Allow the last potential unrolled copy
When the dst buffer pointer points to the last accessible aligned addr, we
could still run another iteration of unrolled copy.

Signed-off-by: Xiao Wang <xiao.w.wang@intel.com>
Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com>
Link: https://lore.kernel.org/r/20240313103334.4036554-1-xiao.w.wang@intel.com
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:54 -07:00
Xingyou Chen
7e6eae24da
riscv: typo in comment for get_f64_reg
Signed-off-by: Xingyou Chen <rockrush@rockwork.org>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Link: https://lore.kernel.org/r/20240317055556.9449-1-rockrush@rockwork.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
2024-05-22 16:12:53 -07:00