-----BEGIN PGP SIGNATURE-----
iQFSBAABCAA8FiEEq68RxlopcLEwq+PEeb4+QwBBGIYFAmJu9FYeHHRvcnZhbGRz
QGxpbnV4LWZvdW5kYXRpb24ub3JnAAoJEHm+PkMAQRiGAyEH/16xtJSpLmLwrQzG
o+4ToQxSQ+/9UHyu0RTEvHg2THm9/8emtIuYyc/5FgdoWctcSa3AaDcveWmuWmkS
KYcdhfJsaEqjNHS3OPYXN84fmo9Hel7263shu5+IYmP/sN0DfQp6UWTryX1q4B3Q
4Pdutkuq63Uwd8nBZ5LXQBumaBrmkkuMgWEdT4+6FOo1mPzwdIGBxCuz1UsNNl5k
chLWxkQfe2eqgWbYJrgCQfrVdORXVtoU2fGilZUNrHRVGkkldXkkz5clJfapyZD3
odmZCEbrE4GPKgZwCmDERMfD1hzhZDtYKiHfOQ506szH5ykJjPBcOjHed7dA60eB
J3+wdek=
=39Ca
-----END PGP SIGNATURE-----
Merge tag 'v5.18-rc5' into devel
Merge in Linux 5.18-rc5 since new code to the STM32 driver
depend in a non-trivial way on the fixes merged in -rc5.
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
* Take care of faults occuring between the PARange and
IPA range by injecting an exception
* Fix S2 faults taken from a host EL0 in protected mode
* Work around Oops caused by a PMU access from a 32bit
guest when PMU has been created. This is a temporary
bodge until we fix it for good.
x86:
* Fix potential races when walking host page table
* Fix shadow page table leak when KVM runs nested
* Work around bug in userspace when KVM synthesizes leaf
0x80000021 on older (pre-EPYC) or Intel processors
Generic (but affects only RISC-V):
* Fix bad user ABI for KVM_EXIT_SYSTEM_EVENT
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJuxI4UHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroNjfQf/X4Rn6+sTkXRS0UHWEu+q9FjJ+mIx
ZUWdbncf0brUB1RPAFfKaiQHo0t2Req+iTlpqZL0nVQ4myNUelHYube/sZdK/aBR
WOjKZE0hugGyMH3js2bsTdgzbcphThyYAX97qGZNb7tsPGhBiw7c98KhjxlieJab
D8LMNtM3uzPDxg422GfOm8ge2VbpySS5oRoGHfbD+4FiLYlXoCYfZuzlFwFFIGxw
uHm5zzfX5jshayFpFYVSJHtARXlpwJWKz9yl63QjHrhVitW4m5j4re3aNfboL6Pd
F5Z9K+DKhJLAH5cqmgiPPe2CGMvmRwKrN3F9MqV91xDPBT8J4rrowEeboQ==
=SwSU
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"ARM:
- Take care of faults occuring between the PARange and IPA range by
injecting an exception
- Fix S2 faults taken from a host EL0 in protected mode
- Work around Oops caused by a PMU access from a 32bit guest when PMU
has been created. This is a temporary bodge until we fix it for
good.
x86:
- Fix potential races when walking host page table
- Fix shadow page table leak when KVM runs nested
- Work around bug in userspace when KVM synthesizes leaf 0x80000021
on older (pre-EPYC) or Intel processors
Generic (but affects only RISC-V):
- Fix bad user ABI for KVM_EXIT_SYSTEM_EVENT"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
KVM: x86: work around QEMU issue with synthetic CPUID leaves
Revert "x86/mm: Introduce lookup_address_in_mm()"
KVM: x86/mmu: fix potential races when walking host page table
KVM: fix bad user ABI for KVM_EXIT_SYSTEM_EVENT
KVM: x86/mmu: Do not create SPTEs for GFNs that exceed host.MAXPHYADDR
KVM: arm64: Inject exception on out-of-IPA-range translation fault
KVM/arm64: Don't emulate a PMU for 32-bit guests if feature not set
KVM: arm64: Handle host stage-2 faults from 32-bit EL0
solely controlled by the hypervisor
- A build fix to make the function prototype (__warn()) as visible as
the definition itself
- A bunch of objtool annotation fixes which have accumulated over time
- An ORC unwinder fix to handle bad input gracefully
- Well, we thought the microcode gets loaded in time in order to restore
the microcode-emulated MSRs but we thought wrong. So there's a fix for
that to have the ordering done properly
- Add new Intel model numbers
- A spelling fix
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJucwMACgkQEsHwGGHe
VUpgiw/8CuOXJhHSuYscEfAmPGoiG9+oLTYVc1NEfJEIyNuZULcr+aYlddTF79hm
V+Flq6FyA3NU220F8t5s3jOaDkWjWJ8nZGPUUxo5+yNHugIGYh/kLy6w8LC8SgLq
GqqYX4fd28tqFSgIBCrr+9GgpTE7bvzBGYLByKj9AO6ecLvWJmc+bENQCTaTRFgl
og6xenzyECWxgbWIql0UeB1xw2AJ8UfYVeLKzOHpc95ZF209+mg7JLL5yIxwwgNV
/CGoh28+twjX5SA1rr3cUx9gmFzrYubYZMglhgugBsShkdfuMLhis4woU7lF7cV9
HnxH6mkvN4R0Im7DZXgQPJ63ZFLJ8tN3RyLQDYBRd71w0Epr/K2aacYeQkWTflcx
4Ia+AiJ7rpKx0cUbUHX7pf3lzna/c8u/xPnlAIbR6rfwXO5mACupaofN5atAdx9T
9rPCPIdroM5XzBTiN4aNJHEsADL1h/oQdzrziTwryyezbTtnNC5KW53hnqyf5Bqo
gBlbfVsnwM0AfLHSPE1D0liOR2spwuB+/bWrsOCzEYENC44nDxHE/MUUjg7/l+Vr
6N5syrQ7QsIPqUaEM+bQdKHGaXSU6amF8OWpFMjzkleQw5m7/X8LzyZsBlB4yeqv
63hUEpdmFyR/6bLdEvjUXeAPcbA41WHwOMdNPaKDqn3zhwYZaa4=
=poyP
-----END PGP SIGNATURE-----
Merge tag 'x86_urgent_for_v5.18_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull x86 fixes from Borislav Petkov:
- A fix to disable PCI/MSI[-X] masking for XEN_HVM guests as that is
solely controlled by the hypervisor
- A build fix to make the function prototype (__warn()) as visible as
the definition itself
- A bunch of objtool annotation fixes which have accumulated over time
- An ORC unwinder fix to handle bad input gracefully
- Well, we thought the microcode gets loaded in time in order to
restore the microcode-emulated MSRs but we thought wrong. So there's
a fix for that to have the ordering done properly
- Add new Intel model numbers
- A spelling fix
* tag 'x86_urgent_for_v5.18_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
x86/pci/xen: Disable PCI/MSI[-X] masking for XEN_HVM guests
bug: Have __warn() prototype defined unconditionally
x86/Kconfig: fix the spelling of 'becoming' in X86_KERNEL_IBT config
objtool: Use offstr() to print address of missing ENDBR
objtool: Print data address for "!ENDBR" data warnings
x86/xen: Add ANNOTATE_NOENDBR to startup_xen()
x86/uaccess: Add ENDBR to __put_user_nocheck*()
x86/retpoline: Add ANNOTATE_NOENDBR for retpolines
x86/static_call: Add ANNOTATE_NOENDBR to static call trampoline
objtool: Enable unreachable warnings for CLANG LTO
x86,objtool: Explicitly mark idtentry_body()s tail REACHABLE
x86,objtool: Mark cpu_startup_entry() __noreturn
x86,xen,objtool: Add UNWIND hint
lib/strn*,objtool: Enforce user_access_begin() rules
MAINTAINERS: Add x86 unwinding entry
x86/unwind/orc: Recheck address range after stack info was updated
x86/cpu: Load microcode during restore_processor_state()
x86/cpu: Add new Alderlake and Raptorlake CPU model numbers
fallthrough detection and relocation handling of weak symbols when the
toolchain strips section symbols
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEzv7L6UO9uDPlPSfHEsHwGGHeVUoFAmJuckgACgkQEsHwGGHe
VUrnTw//TQ1gcAYX4vNibZvOYLRS090uvrnfrosCLBTlOLuPTnB71hTTCxaV6wPV
lXbW5n795G9XmQAkKyqRjNA2PHGKP+D187ooFwJjHW661+dQgdo4EhbRtR4s/IMW
Vd3ZRL0bmCImPKz4MrSVPEL0UotMHI2XYwr6Wf/kOmJ6nlTgmnVE3dI4sOkXQCtJ
ZMCtSm6XN4LTnYLgkP99AuPQe4tC2Fw/zXkFZWkm3Ku6xvEtyfSLLByli8Tqf4p9
mcVoLfBnvYc6ift/tBg9tGFTdw8BzQdmhvnwgMnouiA7bjuhEZ+ef7+LwEpg/5J6
tMNIeO9m8DzR1jZm2vuu+VHB+GwYonXhElJY8JbpGfvI/zjYhxHNdyx3Nn9Cpd7B
whxu7dRodUmI78/Ab3ywA+rDbMQw9ljT4254JhA/VeHxWuKodWU5PKRcS9nYSR+p
NNSSxWmzy4+3h4d9Twd35CWa7ALroepr4JjyEs54Xar7kmoZhiFg8/P0cD2u5ZtL
aBuDDOw8sQOzFHY8sQpYr4k4sI7VdA8fOBXJ0bllu962Gg1aujfuHlCP/ToRpJGc
2YXXUI0tWmOsn5pGI5ludAQ5B+M0j1JxrowEb+gPfuqk7hoN53c4fery4JjtrsJ5
0DPsSKq9SVY+SSLNTuTchQUBZcWAY3GXZYBHr8KuV+iY1zL/rCg=
=7nEx
-----END PGP SIGNATURE-----
Merge tag 'objtool_urgent_for_v5.18_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull objtool fixes from Borislav Petkov:
"A bunch of objtool fixes to improve unwinding, sibling call detection,
fallthrough detection and relocation handling of weak symbols when the
toolchain strips section symbols"
* tag 'objtool_urgent_for_v5.18_rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
objtool: Fix code relocs vs weak symbols
objtool: Fix type of reloc::addend
objtool: Fix function fallthrough detection for vmlinux
objtool: Fix sibling call detection in alternatives
objtool: Don't set 'jump_dest' for sibling calls
x86/uaccess: Don't jump between functions
- A fix for a regression caused by the previous set of bugfixes
changing tegra and at91 pinctrl properties. More work is needed
to figure out what this should actually be, but a revert makes
it work for the moment.
- Defconfig regression fixes for tegra after renamed symbols
- Build-time warning and static checker fixes for imx, op-tee,
sunxi, meson, at91, and omap
- More at91 DT fixes for audio, regulator and spi nodes
- A regression fix for Renesas Hyperflash memory probe
- A stability fix for amlogic boards, modifying the allowed
cpufreq states
- Multiple fixes for system suspend on omap2+
- DT fixes for various i.MX bugs
- A probe error fix for imx6ull-colibri MMC
- A MAINTAINERS file entry for samsung bug reports
-----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmJsVcUACgkQmmx57+YA
GNkvWA//WHU9udPtwJZFFeVqDqHcGF4KWu9Y0NtEHGbMFrPCepIMAqMe/EDoKNKn
Covl4h63XWwQbI82pAmSCY+cBK7zb9o5a0chXV0wZCZOvWxTRnOklJppyRtRzbPL
Nb2fh2Gbl9KFXSqnbMdAdCyeiEAe1MunCzTVfzzL8eyGLv0t5lyCChQZqkrQ+Axe
bnY93HucfULJh2H3J5hdGIdo3iklOigFq5ZvltSedaKaGl+pnKJ49KdyKSXT8jl0
N/grhpvYukIBvDvuowkav8/h0U+7nlLGEzVbnDBzSi4PYHmorY0S4tBjuTR87w2W
h/0xgdd3SPyBS19Q3dW/67Hx9O3UF0ecAaW2MK/wV+Y6nX68ip79E+zAN8LFwuQW
Lw53fyc/NgMBHMmAHBP8jvuedYAdYZ7tXgtPBSKLNIoDpbwaT5IxKD+E+0Vbf2vl
kHSPuo7e7zC2Mw+opf8J+hPOtG/mmGVNpwSq7RMyQx/AYD5h6g5M30dQcNgKoi0V
80isG8bEj0fdu4GMX0IW+lNEqrMz/pW6iB/mqHQbQbhNVgYiiQCeLmLHpXwlgriU
kRC8KAor5jKUn5IST7FjAa7FCEun2hWU7vS+Ye+aZPanxzu/4r8Zj4az31lEmGyT
1hBIiy0/1XuLiQ6mmqIAat7PhML9UKQIQuzNbpibdSR/2Llc4OY=
=GXTu
-----END PGP SIGNATURE-----
Merge tag 'soc-fixes-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc
Pull ARM SoC fixes from Arnd Bergmann:
- A fix for a regression caused by the previous set of bugfixes
changing tegra and at91 pinctrl properties.
More work is needed to figure out what this should actually be, but a
revert makes it work for the moment.
- Defconfig regression fixes for tegra after renamed symbols
- Build-time warning and static checker fixes for imx, op-tee, sunxi,
meson, at91, and omap
- More at91 DT fixes for audio, regulator and spi nodes
- A regression fix for Renesas Hyperflash memory probe
- A stability fix for amlogic boards, modifying the allowed cpufreq
states
- Multiple fixes for system suspend on omap2+
- DT fixes for various i.MX bugs
- A probe error fix for imx6ull-colibri MMC
- A MAINTAINERS file entry for samsung bug reports
* tag 'soc-fixes-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (42 commits)
Revert "arm: dts: at91: Fix boolean properties with values"
bus: sunxi-rsb: Fix the return value of sunxi_rsb_device_create()
Revert "arm64: dts: tegra: Fix boolean properties with values"
arm64: dts: imx8mn-ddr4-evk: Describe the 32.768 kHz PMIC clock
ARM: dts: imx6ull-colibri: fix vqmmc regulator
MAINTAINERS: add Bug entry for Samsung and memory controller drivers
memory: renesas-rpc-if: Fix HF/OSPI data transfer in Manual Mode
ARM: dts: logicpd-som-lv: Fix wrong pinmuxing on OMAP35
ARM: dts: am3517-evm: Fix misc pinmuxing
ARM: dts: am33xx-l4: Add missing touchscreen clock properties
ARM: dts: Fix mmc order for omap3-gta04
ARM: dts: at91: fix pinctrl phandles
ARM: dts: at91: sama5d4_xplained: fix pinctrl phandle name
ARM: dts: at91: Describe regulators on at91sam9g20ek
ARM: dts: at91: Map MCLK for wm8731 on at91sam9g20ek
ARM: dts: at91: Fix boolean properties with values
ARM: dts: at91: use generic node name for dataflash
ARM: dts: at91: align SPI NOR node name with dtschema
ARM: dts: at91: sama7g5ek: Align the impedance of the QSPI0's HSIO and PCB lines
ARM: dts: at91: sama7g5ek: enable pull-up on flexcom3 console lines
...
touching the core so these fixes are fairly well contained to specific
devices that use these clk drivers.
- Some Allwinner SoC fixes to gracefully handle errors and mark an RTC
clk as critical so that the RTC keeps ticking.
- Fix AXI bus clks and RTC clk design for Microchip PolarFire SoC
driver introduced this cycle. This has some devicetree bits acked by
riscv maintainers. We're fixing it now so that the prior bindings
aren't released in a major kernel version.
- Remove a reset on Microchip PolarFire SoCs that broke when enabling
CONFIG_PM.
- Set a min/max for the Qualcomm graphics clk. This got broken by the
clk rate range patches introduced this cycle.
-----BEGIN PGP SIGNATURE-----
iQJFBAABCAAvFiEE9L57QeeUxqYDyoaDrQKIl8bklSUFAmJsTdQRHHNib3lkQGtl
cm5lbC5vcmcACgkQrQKIl8bklSWi/xAAizhBY4W/UdIVskDIBafBwa1WFejIHUWn
KH4NmQzfo4P1IGjLId3V9wspKD4BF/F+WT4nxkqYg5WQc2KWfKTB2co7GbJXP0FC
2pYNF+wstfTo6jiwmOHdulgt3ZAHamfm9kukUbmHxCJBVbvJORGVOQmqwQMSTLmQ
YozTee2iCwKlfDHEQAzz6G8kp1c8Uo6IPbl+sarvjvzpEuox0r8d+TN+VJfSiFDo
8/Exi06s9R9qtzNHs3ffFFDZpkOcRxj5KnuI2d3B04hJ7zf0E7GPYAyJ7Kt3Lhk5
jyqPkCox2mObBxDVZX7nJqfwaXYGNdMOWSguONIv3VQ+PnYuEc1rZ6oT8UEw+kkb
2GQkp/ZIJVrRnydm/HPuTYUvs2tM0HJ1k6a9nlPioPvGIqnkXcrQONoM/qhLQelI
PnkvdRChPa/W+JPxR7Su6BnNXcEpDG6+8NXfpPUbax9cOUFT2Oo+wIZomuCL0LJc
7+kL28wNEuG/A3Dd3J9hLzc8D844IrungpilgFZRoJNy9C4InNnmTLRKcYVbzjPj
xdt6x9TJItLPwplkjuHuHJUSjJRd5czz45objGDHN6JAG3R2cDFEt0cqQRwEXjgO
ItwHhpUQA3S9oonDzcxZqVU2Jz2XlbDw5ryAqxIj7GWYhB0RuH7NMdRo6hX1jbtR
Am4PVq4XEuk=
=i8nA
-----END PGP SIGNATURE-----
Merge tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux
Pull clk fixes from Stephen Boyd:
"A semi-large pile of clk driver fixes this time around.
Nothing is touching the core so these fixes are fairly well contained
to specific devices that use these clk drivers.
- Some Allwinner SoC fixes to gracefully handle errors and mark an
RTC clk as critical so that the RTC keeps ticking.
- Fix AXI bus clks and RTC clk design for Microchip PolarFire SoC
driver introduced this cycle. This has some devicetree bits acked
by riscv maintainers. We're fixing it now so that the prior
bindings aren't released in a major kernel version.
- Remove a reset on Microchip PolarFire SoCs that broke when enabling
CONFIG_PM.
- Set a min/max for the Qualcomm graphics clk. This got broken by the
clk rate range patches introduced this cycle"
* tag 'clk-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/clk/linux:
clk: sunxi: sun9i-mmc: check return value after calling platform_get_resource()
clk: sunxi-ng: sun6i-rtc: Mark rtc-32k as critical
riscv: dts: microchip: reparent mpfs clocks
clk: microchip: mpfs: add RTCREF clock control
clk: microchip: mpfs: re-parent the configurable clocks
dt-bindings: rtc: add refclk to mpfs-rtc
dt-bindings: clk: mpfs: add defines for two new clocks
dt-bindings: clk: mpfs document msspll dri registers
riscv: dts: microchip: fix usage of fic clocks on mpfs
clk: microchip: mpfs: mark CLK_ATHENA as critical
clk: microchip: mpfs: fix parents for FIC clocks
clk: qcom: clk-rcg2: fix gfx3d frequency calculation
clk: microchip: mpfs: don't reset disabled peripherals
clk: sunxi-ng: fix not NULL terminated coccicheck error
This reverts commit 0dc23d1a8e17, which caused another regression
as the pinctrl code actually expects an integer value of 0 or 1
rather than a simple boolean property.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Synthesizing AMD leaves up to 0x80000021 caused problems with QEMU,
which assumes the *host* CPUID[0x80000000].EAX is higher or equal
to what KVM_GET_SUPPORTED_CPUID reports.
This causes QEMU to issue bogus host CPUIDs when preparing the input
to KVM_SET_CPUID2. It can even get into an infinite loop, which is
only terminated by an abort():
cpuid_data is full, no space for cpuid(eax:0x8000001d,ecx:0x3e)
To work around this, only synthesize those leaves if 0x8000001d exists
on the host. The synthetic 0x80000021 leaf is mostly useful on Zen2,
which satisfies the condition.
Fixes: f144c49e8c39 ("KVM: x86: synthesize CPUID leaf 0x80000021h if useful")
Reported-by: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* A fix to properly ensure a single CPU is running during patch_text().
* A defconfig update to include RPMSG_CTRL when RPMSG_CHAR was set,
necessary after a recent refactoring.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmJsAHYTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYiZP7D/9ASxIyJFLpiyVisC/5WnCJHmWUXWJC
DVlXXfR0U6mjFZcvWYvVTIpBa37qUb6p1w85NOK1+zE3gURyTWdAsA5dXxK3PWGr
zZhVaNPGlQx7nzI675z1JKFEp+EcE1UjzLKL+jdkwY+ep/BSc2vMN0Kj5cbW3L/j
/Bsi1HK8LdMxG9uTCxdDYs3qKlIh/ZgW0Lvnj3sjV8tqW4D4iXMqcL8l/ydpDnwL
amYrc4erQyOKA7kbYDPMxUx/A8SVrXepHowUz6lpyJLCC8tdmVwI7rjImb5FzLJK
idTX7ZAnUiT68Jom3VOhqGYVRS/uZHxr3kLtkNtPAm+yHafvNRnYPrqmhTpTV4BP
eZpvUq5a90NoX+Xi1s0ou+m0LGbIzXOKVcC8ffnxd6I3EGQiWYvJyjwi1LEGFLMB
iP+iT5GlzWbrAsrXaHGGVvc2TymfaxL4VglapBjRYSLV6Evg0QorZbyLrNepBKaL
XO61miNN6FFETkKZ29vd1KAzEmT8/Gf1APJwRKsRtcfAE1ndm0lJ/j5CRF5rP45K
cp9zl/i/jthZSBR7NjQDCIXNlK+MvAs9HtwuDIP+vgVyfTQDZJcNfeTnSJdf46KV
dHtn9eu00fMNEIiFvFbYyWGfcHyKwGziIUWDxe7OtLbmyiHWzKdVZMR29WTmJHpI
0KMTBoUya4yAbA==
=BK1v
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes from Palmer Dabbelt:
- A fix to properly ensure a single CPU is running during patch_text().
- A defconfig update to include RPMSG_CTRL when RPMSG_CHAR was set,
necessary after a recent refactoring.
* tag 'riscv-for-linus-5.18-rc5' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: configs: Configs that had RPMSG_CHAR now get RPMSG_CTRL
riscv: patch_text: Fixup last cpu should be master
- Rename and reallocate the PT_ARM_MEMTAG_MTE ELF segment type
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmJryRkQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNOS/B/43vhg0Bz0sH9poU0r3YE6TfEeCTd2HwCBq
Caf6VAM6K9Y7JwPSFkAgnmW2DOG1hif83mCaX47Dkb4bJZaUQpmzJ+ZsK0nCe/Wo
GIBzTdjRLrl6LQAqcW860s2eE4Ac212LszDxdp5DXFp4HjzbSfsteUFWipnpyXEs
KtA4Z991fbsj+FsWY42zqhFkMnnPO+6twT5okYN/S6tONrojCTZhRTKupBr4VRAk
94xD+pJI/8EfyakEhNN2PabhCvlq0bEI/EB6pJMrL6GjMs5t0pbpowXHzMoctZuf
/SkRgyILDNUPbAa6wGCJoV4ZcgFH7PDy80iDMwNscb/DgAO8xo6t
=hafH
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fix from Will Deacon:
"Rename and reallocate the PT_ARM_MEMTAG_MTE ELF segment type.
This is a fix to the MTE ELF ABI for a bug that was added during the
most recent merge window as part of the coredump support.
The issue is that the value assigned to the new PT_ARM_MEMTAG_MTE
segment type has already been allocated to PT_AARCH64_UNWIND by the
ELF ABI, so we've bumped the value and changed the name of the
identifier to be better aligned with the existing one"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
elf: Fix the arm64 MTE ELF segment name and value
Drop lookup_address_in_mm() now that KVM is providing it's own variant
of lookup_address_in_pgd() that is safe for use with user addresses, e.g.
guards against page tables being torn down. A variant that provides a
non-init mm is inherently dangerous and flawed, as the only reason to use
an mm other than init_mm is to walk a userspace mapping, and
lookup_address_in_pgd() does not play nice with userspace mappings, e.g.
doesn't disable IRQs to block TLB shootdowns and doesn't use READ_ONCE()
to ensure an upper level entry isn't converted to a huge page between
checking the PAGE_SIZE bit and grabbing the address of the next level
down.
This reverts commit 13c72c060f1ba6f4eddd7b1c4f52a8aded43d6d9.
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <YmwIi3bXr/1yhYV/@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes for (relatively) old bugs, to be merged in both the -rc and next
development trees:
* Fix potential races when walking host page table
* Fix bad user ABI for KVM_EXIT_SYSTEM_EVENT
* Fix shadow page table leak when KVM runs nested
KVM uses lookup_address_in_mm() to detect the hugepage size that the host
uses to map a pfn. The function suffers from several issues:
- no usage of READ_ONCE(*). This allows multiple dereference of the same
page table entry. The TOCTOU problem because of that may cause KVM to
incorrectly treat a newly generated leaf entry as a nonleaf one, and
dereference the content by using its pfn value.
- the information returned does not match what KVM needs; for non-present
entries it returns the level at which the walk was terminated, as long
as the entry is not 'none'. KVM needs level information of only 'present'
entries, otherwise it may regard a non-present PXE entry as a present
large page mapping.
- the function is not safe for mappings that can be torn down, because it
does not disable IRQs and because it returns a PTE pointer which is never
safe to dereference after the function returns.
So implement the logic for walking host page tables directly in KVM, and
stop using lookup_address_in_mm().
Cc: Sean Christopherson <seanjc@google.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220429031757.2042406-1-mizhang@google.com>
[Inline in host_pfn_mapping_level, ensure no semantic change for its
callers. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When KVM_EXIT_SYSTEM_EVENT was introduced, it included a flags
member that at the time was unused. Unfortunately this extensibility
mechanism has several issues:
- x86 is not writing the member, so it would not be possible to use it
on x86 except for new events
- the member is not aligned to 64 bits, so the definition of the
uAPI struct is incorrect for 32- on 64-bit userspace. This is a
problem for RISC-V, which supports CONFIG_KVM_COMPAT, but fortunately
usage of flags was only introduced in 5.18.
Since padding has to be introduced, place a new field in there
that tells if the flags field is valid. To allow further extensibility,
in fact, change flags to an array of 16 values, and store how many
of the values are valid. The availability of the new ndata field
is tied to a system capability; all architectures are changed to
fill in the field.
To avoid breaking compilation of userspace that was using the flags
field, provide a userspace-only union to overlap flags with data[0].
The new field is placed at the same offset for both 32- and 64-bit
userspace.
Cc: Will Deacon <will@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Peter Gonda <pgonda@google.com>
Cc: Sean Christopherson <seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: kernel test robot <lkp@intel.com>
Message-Id: <20220422103013.34832-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Disallow memslots and MMIO SPTEs whose gpa range would exceed the host's
MAXPHYADDR, i.e. don't create SPTEs for gfns that exceed host.MAXPHYADDR.
The TDP MMU bounds its zapping based on host.MAXPHYADDR, and so if the
guest, possibly with help from userspace, manages to coerce KVM into
creating a SPTE for an "impossible" gfn, KVM will leak the associated
shadow pages (page tables):
WARNING: CPU: 10 PID: 1122 at arch/x86/kvm/mmu/tdp_mmu.c:57
kvm_mmu_uninit_tdp_mmu+0x4b/0x60 [kvm]
Modules linked in: kvm_intel kvm irqbypass
CPU: 10 PID: 1122 Comm: set_memory_regi Tainted: G W 5.18.0-rc1+ #293
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
RIP: 0010:kvm_mmu_uninit_tdp_mmu+0x4b/0x60 [kvm]
Call Trace:
<TASK>
kvm_arch_destroy_vm+0x130/0x1b0 [kvm]
kvm_destroy_vm+0x162/0x2d0 [kvm]
kvm_vm_release+0x1d/0x30 [kvm]
__fput+0x82/0x240
task_work_run+0x5b/0x90
exit_to_user_mode_prepare+0xd2/0xe0
syscall_exit_to_user_mode+0x1d/0x40
entry_SYSCALL_64_after_hwframe+0x44/0xae
</TASK>
On bare metal, encountering an impossible gpa in the page fault path is
well and truly impossible, barring CPU bugs, as the CPU will signal #PF
during the gva=>gpa translation (or a similar failure when stuffing a
physical address into e.g. the VMCS/VMCB). But if KVM is running as a VM
itself, the MAXPHYADDR enumerated to KVM may not be the actual MAXPHYADDR
of the underlying hardware, in which case the hardware will not fault on
the illegal-from-KVM's-perspective gpa.
Alternatively, KVM could continue allowing the dodgy behavior and simply
zap the max possible range. But, for hosts with MAXPHYADDR < 52, that's
a (minor) waste of cycles, and more importantly, KVM can't reasonably
support impossible memslots when running on bare metal (or with an
accurate MAXPHYADDR as a VM). Note, limiting the overhead by checking if
KVM is running as a guest is not a safe option as the host isn't required
to announce itself to the guest in any way, e.g. doesn't need to set the
HYPERVISOR CPUID bit.
A second alternative to disallowing the memslot behavior would be to
disallow creating a VM with guest.MAXPHYADDR > host.MAXPHYADDR. That
restriction is undesirable as there are legitimate use cases for doing
so, e.g. using the highest host.MAXPHYADDR out of a pool of heterogeneous
systems so that VMs can be migrated between hosts with different
MAXPHYADDRs without running afoul of the allow_smaller_maxphyaddr mess.
Note that any guest.MAXPHYADDR is valid with shadow paging, and it is
even useful in order to test KVM with MAXPHYADDR=52 (i.e. without
any reserved physical address bits).
The now common kvm_mmu_max_gfn() is inclusive instead of exclusive.
The memslot and TDP MMU code want an exclusive value, but the name
implies the returned value is inclusive, and the MMIO path needs an
inclusive check.
Fixes: faaf05b00aec ("kvm: x86/mmu: Support zapping SPTEs in the TDP MMU")
Fixes: 524a1e4e381f ("KVM: x86/mmu: Don't leak non-leaf SPTEs when zapping all SPTEs")
Cc: stable@vger.kernel.org
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Cc: Ben Gardon <bgardon@google.com>
Cc: David Matlack <dmatlack@google.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Message-Id: <20220428233416.2446833-1-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
- Take care of faults occuring between the PARange and
IPA range by injecting an exception
- Fix S2 faults taken from a host EL0 in protected mode
- Work around Oops caused by a PMU access from a 32bit
guest when PMU has been created. This is a temporary
bodge until we fix it for good.
-----BEGIN PGP SIGNATURE-----
iQJDBAABCgAtFiEEn9UcU+C1Yxj9lZw9I9DQutE9ekMFAmJsA9kPHG1hekBrZXJu
ZWwub3JnAAoJECPQ0LrRPXpD/FYP/1G0ctyOaZDWC90sZVtBmVfO9Sc1xiZ2Kael
3+79N96ncFHp8oKTDz4zcSvs2hDG+9tUhG54MVr1DnPuO5rQubEGr+inYnb1cQfr
j0qBxvVAyP9wnQmtPeTvbHJCumB3RCm5d1tDLhmr4vnh5vBOn36fpCEhKhuv/2ib
xsXE8rhL0PjRV+NV5KAlO5bJuUU0F8OaZ8612yyO0ExtV+6t1WQGsIBhf3aBRhAY
oaYG2CmSgsifW9fgXdz/CVzRdPGV4E/hhAcQbY3CmzMbzlO6tqbOIK1NoMivoNIg
smKMJNhnaVLSybprAcQvAG9H1vbGSQWLurd42EvZIzn59+ILFHW90fYmW4QM69P3
mqOW40RXlZ/8BwIVEftzyZ8J8XNTpaDeGZ4D1aHY6yvKTd7ApiiU3OZOrYgZ7H/a
d/2W3tQfV9MlifJjNFZDWj8fMqxLifcZksvMfwbl775B/9SO66UHx00qFhZxQ5nI
7A8RKyB2jRRM/ndb337XMWoHQVwjof4CUZG1XuLSyYtb1K9joROoLD0hiNYTJOek
QxT3T+g5FF3YpxrB270lKA4xJ6fUqUgAisThY5AZnK3Ezw8udx18ADT+X76/OqqQ
GTxG3v45YObjA2t4xSk8TszFg83JubsCGKYQLKEbdMsfBIkhvT+fyrW4fCxODGHH
r/rK7iuK
=Ejoh
-----END PGP SIGNATURE-----
Merge tag 'kvmarm-fixes-5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/kvmarm/kvmarm into HEAD
KVM/arm64 fixes for 5.18, take #2
- Take care of faults occuring between the PARange and
IPA range by injecting an exception
- Fix S2 faults taken from a host EL0 in protected mode
- Work around Oops caused by a PMU access from a 32bit
guest when PMU has been created. This is a temporary
bodge until we fix it for good.
This contains two updates to the default configuration needed because of
a Kconfig symbol name change. This fixes a failure that was detected in
the NVIDIA automated test farm.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEiOrDCAFJzPfAjcif3SOs138+s6EFAmJrm1UTHHRyZWRpbmdA
bnZpZGlhLmNvbQAKCRDdI6zXfz6zobdLD/wM72K2ZSTaYglcq3XH/UcoGFxdxX4T
4HT3KtuudmLDuTHdSJXzWKHjcYn2Qkn6B7APBIihyXt74sfC1ZrrTv4RafkWGAnj
SLMAL3U4XwHrSoc8L/sjCBE61Rybn+0hPR5gputvxJHFoX4WS+hNYV8o3mdJBXME
tc8KJZhm1EPaa4fl3ep8lwzJBOgoNkv6yOk/k0skR7eGmz8iIMsH8QoQlzoMRD6f
dt2Hih5nzoZKL8HKz2H0ct24eAp3ERXcrce5yCPChMO19/HTwHnjvICwtlxkJ3lm
OAm13I1naURAwqyvdnH6KHb6gi1TINHY1MM3IIf6iLzfuY9iZELBS3AiT78GF3HA
PI4sPl8OgODLBrZTiqIuDVWLEsje5kSNhb80a+OLp9KUuaGybdkogBulHXE0OPIf
eafDRK23iJ7G8vu744uvdI9CcWdpKLTPOHlbvu9HVzVWsLdFUlUxSM2N3dg7XaBX
jn4LWsx5OA95cMmnpcYVLs11DmbpQ9fVjpJDBXqgQqb+ra3bmEL312g9j0ReKick
GV4HUR+ZDrZ9U+X60Fp3Lm+FOAjeZ8ZkuPUeI5CpaT57eGUieu67uUvAClrA970J
s3Ptai5JX8KffBmD4/H7d5spz30BPjqv88OVhYR4GPSwP10TW9qedxCQWP5snuJz
xKYuJ0BjdtyaXw==
=AH48
-----END PGP SIGNATURE-----
gpgsig -----BEGIN PGP SIGNATURE-----
iQIzBAABCgAdFiEEo6/YBQwIrVS28WGKmmx57+YAGNkFAmJr+RIACgkQmmx57+YA
GNkxsQ/9E3hL2ClQtrmTr3Zronjk/rtg++G0/V4LjWTtKD6UjcsbJU9hMDCROmw+
FFTfzzeQxq5byvC19VByh6nlmMpFJ8dyrMfhNG1TKTiRyFkZKTwM67ZqneAgb84T
8jaVp9P/iOvdcFUBBL7zDbsIZPjrSPDRiLZmw13y3DkWgFhvM3ixGhv7+tMaXYQU
Yc3VPH9uurZ5Wdk52fX1Iu3hbRwaJw4lFVW+RSKX2O3DznaLCXHF1k/KTtWG2i+l
PaNClQaCWYHG4cinGMKcXkvyhsUJeIcqVP9ESxAa7z7w69wN986tZBwSQbtnw/nR
/lOHjo1mKpe9DU2C4LAxNHwdRWkg9j6In8Y5mccDPgDHHaJWIeVlmutmt/GYB1ek
Gh2vqaqmEZXQr8FGL++sISLB3aspNyknXabtHHztKOPR7DBYyesLtB9L1JSCDVXN
krqcVBO4+yQyhDjAlWLmG2xY+SOyNe158xFrRReUn54/u5WE/fJyFiRc/Zgt9cyD
P4G14iHjVwDNshIKx4pi4aWL9gSf+zc/LIvGG19aW0Z8K5s7lXNYjUMgPZCg8XHX
+H1s67AxoKLVv/+GWZ4Cg6PYH4yTw9PSSn3oXqaZfQ8L9LtEMpqri1Vn0eEq4AbH
OxcPRt3C3GvpekTE7xLi07tTEXd7mnrYtRoCpbBfpZc/+R0kQms=
=12TT
-----END PGP SIGNATURE-----
Merge tag 'tegra-for-5.18-arm-defconfig-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux into arm/fixes
ARM: tegra: Default configuration fixes for v5.18
This contains two updates to the default configuration needed because of
a Kconfig symbol name change. This fixes a failure that was detected in
the NVIDIA automated test farm.
* tag 'tegra-for-5.18-arm-defconfig-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tegra/linux:
ARM: config: multi v7: Enable NVIDIA Tegra video decoder driver
ARM: tegra_defconfig: Update CONFIG_TEGRA_VDE option
Link: https://lore.kernel.org/r/20220429080626.494150-1-thierry.reding@gmail.com
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
When a XEN_HVM guest uses the XEN PIRQ/Eventchannel mechanism, then
PCI/MSI[-X] masking is solely controlled by the hypervisor, but contrary to
XEN_PV guests this does not disable PCI/MSI[-X] masking in the PCI/MSI
layer.
This can lead to a situation where the PCI/MSI layer masks an MSI[-X]
interrupt and the hypervisor grants the write despite the fact that it
already requested the interrupt. As a consequence interrupt delivery on the
affected device is not happening ever.
Set pci_msi_ignore_mask to prevent that like it's done for XEN_PV guests
already.
Fixes: 809f9267bbab ("xen: map MSIs into pirqs")
Reported-by: Jeremi Piotrowski <jpiotrowski@linux.microsoft.com>
Reported-by: Dusty Mabe <dustymabe@redhat.com>
Reported-by: Salvatore Bonaccorso <carnil@debian.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Noah Meyerhans <noahm@debian.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87tuaduxj5.ffs@tglx
Unfortunately, the name/value choice for the MTE ELF segment type
(PT_ARM_MEMTAG_MTE) was pretty poor: LOPROC+1 is already in use by
PT_AARCH64_UNWIND, as defined in the AArch64 ELF ABI
(https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst).
Update the ELF segment type value to LOPROC+2 and also change the define
to PT_AARCH64_MEMTAG_MTE to match the AArch64 ELF ABI namespace. The
AArch64 ELF ABI document is updating accordingly (segment type not
previously mentioned in the document).
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Fixes: 761b9b366cec ("elf: Introduce the ARM MTE ELF segment type")
Cc: Will Deacon <will@kernel.org>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Eric Biederman <ebiederm@xmission.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Luis Machado <luis.machado@arm.com>
Cc: Richard Earnshaw <Richard.Earnshaw@arm.com>
Link: https://lore.kernel.org/r/20220425151833.2603830-1-catalin.marinas@arm.com
Signed-off-by: Will Deacon <will@kernel.org>
When taking a translation fault for an IPA that is outside of
the range defined by the hypervisor (between the HW PARange and
the IPA range), we stupidly treat it as an IO and forward the access
to userspace. Of course, userspace can't do much with it, and things
end badly.
Arguably, the guest is braindead, but we should at least catch the
case and inject an exception.
Check the faulting IPA against:
- the sanitised PARange: inject an address size fault
- the IPA size: inject an abort
Reported-by: Christoffer Dall <christoffer.dall@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
kvm->arch.arm_pmu is set when userspace attempts to set the first PMU
attribute. As certain attributes are mandatory, arm_pmu ends up always
being set to a valid arm_pmu, otherwise KVM will refuse to run the VCPU.
However, this only happens if the VCPU has the PMU feature. If the VCPU
doesn't have the feature bit set, kvm->arch.arm_pmu will be left
uninitialized and equal to NULL.
KVM doesn't do ID register emulation for 32-bit guests and accesses to the
PMU registers aren't gated by the pmu_visibility() function. This is done
to prevent injecting unexpected undefined exceptions in guests which have
detected the presence of a hardware PMU. But even though the VCPU feature
is missing, KVM still attempts to emulate certain aspects of the PMU when
PMU registers are accessed. This leads to a NULL pointer dereference like
this one, which happens on an odroid-c4 board when running the
kvm-unit-tests pmu-cycle-counter test with kvmtool and without the PMU
feature being set:
[ 454.402699] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000150
[ 454.405865] Mem abort info:
[ 454.408596] ESR = 0x96000004
[ 454.411638] EC = 0x25: DABT (current EL), IL = 32 bits
[ 454.416901] SET = 0, FnV = 0
[ 454.419909] EA = 0, S1PTW = 0
[ 454.423010] FSC = 0x04: level 0 translation fault
[ 454.427841] Data abort info:
[ 454.430687] ISV = 0, ISS = 0x00000004
[ 454.434484] CM = 0, WnR = 0
[ 454.437404] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000c924000
[ 454.443800] [0000000000000150] pgd=0000000000000000, p4d=0000000000000000
[ 454.450528] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 454.456036] Modules linked in:
[ 454.459053] CPU: 1 PID: 267 Comm: kvm-vcpu-0 Not tainted 5.18.0-rc4 #113
[ 454.465697] Hardware name: Hardkernel ODROID-C4 (DT)
[ 454.470612] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 454.477512] pc : kvm_pmu_event_mask.isra.0+0x14/0x74
[ 454.482427] lr : kvm_pmu_set_counter_event_type+0x2c/0x80
[ 454.487775] sp : ffff80000a9839c0
[ 454.491050] x29: ffff80000a9839c0 x28: ffff000000a83a00 x27: 0000000000000000
[ 454.498127] x26: 0000000000000000 x25: 0000000000000000 x24: ffff00000a510000
[ 454.505198] x23: ffff000000a83a00 x22: ffff000003b01000 x21: 0000000000000000
[ 454.512271] x20: 000000000000001f x19: 00000000000003ff x18: 0000000000000000
[ 454.519343] x17: 000000008003fe98 x16: 0000000000000000 x15: 0000000000000000
[ 454.526416] x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000
[ 454.533489] x11: 000000008003fdbc x10: 0000000000009d20 x9 : 000000000000001b
[ 454.540561] x8 : 0000000000000000 x7 : 0000000000000d00 x6 : 0000000000009d00
[ 454.547633] x5 : 0000000000000037 x4 : 0000000000009d00 x3 : 0d09000000000000
[ 454.554705] x2 : 000000000000001f x1 : 0000000000000000 x0 : 0000000000000000
[ 454.561779] Call trace:
[ 454.564191] kvm_pmu_event_mask.isra.0+0x14/0x74
[ 454.568764] kvm_pmu_set_counter_event_type+0x2c/0x80
[ 454.573766] access_pmu_evtyper+0x128/0x170
[ 454.577905] perform_access+0x34/0x80
[ 454.581527] kvm_handle_cp_32+0x13c/0x160
[ 454.585495] kvm_handle_cp15_32+0x1c/0x30
[ 454.589462] handle_exit+0x70/0x180
[ 454.592912] kvm_arch_vcpu_ioctl_run+0x1c4/0x5e0
[ 454.597485] kvm_vcpu_ioctl+0x23c/0x940
[ 454.601280] __arm64_sys_ioctl+0xa8/0xf0
[ 454.605160] invoke_syscall+0x48/0x114
[ 454.608869] el0_svc_common.constprop.0+0xd4/0xfc
[ 454.613527] do_el0_svc+0x28/0x90
[ 454.616803] el0_svc+0x34/0xb0
[ 454.619822] el0t_64_sync_handler+0xa4/0x130
[ 454.624049] el0t_64_sync+0x18c/0x190
[ 454.627675] Code: a9be7bfd 910003fd f9000bf3 52807ff3 (b9415001)
[ 454.633714] ---[ end trace 0000000000000000 ]---
In this particular case, Linux hasn't detected the presence of a hardware
PMU because the PMU node is missing from the DTB, so userspace would have
been unable to set the VCPU PMU feature even if it attempted it. What
happens is that the 32-bit guest reads ID_DFR0, which advertises the
presence of the PMU, and when it tries to program a counter, it triggers
the NULL pointer dereference because kvm->arch.arm_pmu is NULL.
kvm-arch.arm_pmu was introduced by commit 46b187821472 ("KVM: arm64:
Keep a per-VM pointer to the default PMU"). Until that commit, this
error would be triggered instead:
[ 73.388140] ------------[ cut here ]------------
[ 73.388189] Unknown PMU version 0
[ 73.390420] WARNING: CPU: 1 PID: 264 at arch/arm64/kvm/pmu-emul.c:36 kvm_pmu_event_mask.isra.0+0x6c/0x74
[ 73.399821] Modules linked in:
[ 73.402835] CPU: 1 PID: 264 Comm: kvm-vcpu-0 Not tainted 5.17.0 #114
[ 73.409132] Hardware name: Hardkernel ODROID-C4 (DT)
[ 73.414048] pstate: 60400009 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
[ 73.420948] pc : kvm_pmu_event_mask.isra.0+0x6c/0x74
[ 73.425863] lr : kvm_pmu_event_mask.isra.0+0x6c/0x74
[ 73.430779] sp : ffff80000a8db9b0
[ 73.434055] x29: ffff80000a8db9b0 x28: ffff000000dbaac0 x27: 0000000000000000
[ 73.441131] x26: ffff000000dbaac0 x25: 00000000c600000d x24: 0000000000180720
[ 73.448203] x23: ffff800009ffbe10 x22: ffff00000b612000 x21: 0000000000000000
[ 73.455276] x20: 000000000000001f x19: 0000000000000000 x18: ffffffffffffffff
[ 73.462348] x17: 000000008003fe98 x16: 0000000000000000 x15: 0720072007200720
[ 73.469420] x14: 0720072007200720 x13: ffff800009d32488 x12: 00000000000004e6
[ 73.476493] x11: 00000000000001a2 x10: ffff800009d32488 x9 : ffff800009d32488
[ 73.483565] x8 : 00000000ffffefff x7 : ffff800009d8a488 x6 : ffff800009d8a488
[ 73.490638] x5 : ffff0000f461a9d8 x4 : 0000000000000000 x3 : 0000000000000001
[ 73.497710] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000000dbaac0
[ 73.504784] Call trace:
[ 73.507195] kvm_pmu_event_mask.isra.0+0x6c/0x74
[ 73.511768] kvm_pmu_set_counter_event_type+0x2c/0x80
[ 73.516770] access_pmu_evtyper+0x128/0x16c
[ 73.520910] perform_access+0x34/0x80
[ 73.524532] kvm_handle_cp_32+0x13c/0x160
[ 73.528500] kvm_handle_cp15_32+0x1c/0x30
[ 73.532467] handle_exit+0x70/0x180
[ 73.535917] kvm_arch_vcpu_ioctl_run+0x20c/0x6e0
[ 73.540489] kvm_vcpu_ioctl+0x2b8/0x9e0
[ 73.544283] __arm64_sys_ioctl+0xa8/0xf0
[ 73.548165] invoke_syscall+0x48/0x114
[ 73.551874] el0_svc_common.constprop.0+0xd4/0xfc
[ 73.556531] do_el0_svc+0x28/0x90
[ 73.559808] el0_svc+0x28/0x80
[ 73.562826] el0t_64_sync_handler+0xa4/0x130
[ 73.567054] el0t_64_sync+0x1a0/0x1a4
[ 73.570676] ---[ end trace 0000000000000000 ]---
[ 73.575382] kvm: pmu event creation failed -2
The root cause remains the same: kvm->arch.pmuver was never set to
something sensible because the VCPU feature itself was never set.
The odroid-c4 is somewhat of a special case, because Linux doesn't probe
the PMU. But the above errors can easily be reproduced on any hardware,
with or without a PMU driver, as long as userspace doesn't set the PMU
feature.
Work around the fact that KVM advertises a PMU even when the VCPU feature
is not set by gating all PMU emulation on the feature. The guest can still
access the registers without KVM injecting an undefined exception.
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220425145530.723858-1-alexandru.elisei@arm.com
When pKVM is enabled, host memory accesses are translated by an identity
mapping at stage-2, which is populated lazily in response to synchronous
exceptions from 64-bit EL1 and EL0.
Extend this handling to cover exceptions originating from 32-bit EL0 as
well. Although these are very unlikely to occur in practice, as the
kernel typically ensures that user pages are initialised before mapping
them in, drivers could still map previously untouched device pages into
userspace and expect things to work rather than panic the system.
Cc: Quentin Perret <qperret@google.com>
Cc: Marc Zyngier <maz@kernel.org>
Signed-off-by: Will Deacon <will@kernel.org>
Signed-off-by: Marc Zyngier <maz@kernel.org>
Link: https://lore.kernel.org/r/20220427171332.13635-1-will@kernel.org
- Fix some register offsets on Intel Alderlake
- Fix the order the UFS and SDC pins on Qualcomm SM6350
- Fix a build error in Mediatek Moore.
- Fix a pin function table in the Sunplus SP7021.
- Fix some Kconfig and static keywords on the Samsung
Tesla FSD SoC.
- Fix up the EOI function for edge triggered IRQs and keep
the block clock enabled for level IRQs in the
STM32 driver.
- Fix some bits and order in the Rockchip RK3308 driver.
- Handle the errorpath in the Pistachio driver probe()
properly.
-----BEGIN PGP SIGNATURE-----
iQIzBAABCAAdFiEElDRnuGcz/wPCXQWMQRCzN7AZXXMFAmJobdYACgkQQRCzN7AZ
XXMltw/+JuwmtaJOOARGp7oKy3s0L0eV4lquF/a3cvGjViua0HPWlYd8aFAGnSiq
/a0wQT5pmKqdg0HEgVFxNyxe6s7P8RXpFz5kA0jtO85O6/l6GZDnix9kQ7Qdwmin
dP0JGBxr05UbejOMrO6CRSjDWPX+ptPO1DtCYWZC2A3qpwXTYdrnq568uH9dePby
xpqeJ35upIK7DfLBTCa2FaQWcXM3IQsAuUB84gN4oi1UXdimXAXqExRx77vTFrM2
pXmOgKF9l9inAUJnF/X3aM3GeR44x1uAH2mxpMupAWe81WQTn997O2B/odc4Arw1
LpIbBgQPY8+R2Go2CqJ/UKGYqh0jABiZV3DdGEG+Gvex/qfiT6yW4AJvaIoaI6ZT
A9L2aZdw6Q6U7OR6wyQ6CFNxpHeyxjjEbRBYhlQW0Bzb/IZnqtIJagv6M5It0OhI
+FJ4poIPKS5jTsdjUCk5AbWuFP8GMhDE+8eRtBg6CEHsp7+0unGlhAvmztyrtW6G
lGUQBw5OsvKqyPujitmSVWkKC99PhQhd7BR9feSKLP7+9Is7Wo45hamZXbJF5+bz
rqznr4jRMZb7/mUWRLg8wXkmrSaKfzOo9yDyMix5JQw6JyOEw8oyeweI8Mfa+JbG
cRa6PYoVNpHZr+BVqLxMnTTbz/5dtV/NT6H2Ny8c+UYKod8eA4w=
=xFvm
-----END PGP SIGNATURE-----
Merge tag 'pinctrl-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Pull pin control fixes from Linus Walleij:
- Fix some register offsets on Intel Alderlake
- Fix the order the UFS and SDC pins on Qualcomm SM6350
- Fix a build error in Mediatek Moore.
- Fix a pin function table in the Sunplus SP7021.
- Fix some Kconfig and static keywords on the Samsung Tesla FSD SoC.
- Fix up the EOI function for edge triggered IRQs and keep the block
clock enabled for level IRQs in the STM32 driver.
- Fix some bits and order in the Rockchip RK3308 driver.
- Handle the errorpath in the Pistachio driver probe() properly.
* tag 'pinctrl-v5.18-2' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl:
pinctrl: pistachio: fix use of irq_of_parse_and_map()
pinctrl: stm32: Keep pinctrl block clock enabled when LEVEL IRQ requested
pinctrl: rockchip: sort the rk3308_mux_recalced_data entries
pinctrl: rockchip: fix RK3308 pinmux bits
pinctrl: stm32: Do not call stm32_gpio_get() for edge triggered IRQs in EOI
pinctrl: Fix an error in pin-function table of SP7021
pinctrl: samsung: fix missing GPIOLIB on ARM64 Exynos config
pinctrl: mediatek: moore: Fix build error
pinctrl: qcom: sm6350: fix order of UFS & SDC pins
pinctrl: alderlake: Fix register offsets for ADL-N variant
pinctrl: samsung: staticize fsd_pin_ctrl
In the commit 617d32938d1b ("rpmsg: Move the rpmsg control device
from rpmsg_char to rpmsg_ctrl"), we split the rpmsg_char driver in two.
By default give everyone who had the old driver enabled the rpmsg_ctrl
driver too.
Signed-off-by: Arnaud Pouliquen <arnaud.pouliquen@foss.st.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Link: https://lore.kernel.org/r/20220404090527.582217-1-arnaud.pouliquen@foss.st.com
Reviewed-by: Mathieu Poirier <mathieu.poirier@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
This reverts commit 1a67653de0dd, which caused a boot regression.
The behavior of the "drive-push-pull" in the kernel does not
match what the binding document describes. Revert Rob's patch
to make the DT match the kernel again, rather than the binding.
Link: https://lore.kernel.org/lkml/YlVAy95eF%2F9b1nmu@orome/
Reported-by: Thierry Reding <thierry.reding@gmail.com>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
- Partly revert a change to our timer_interrupt() that caused lockups with high res
timers disabled.
- Fix a bug in KVM TCE handling that could corrupt kernel memory.
- Two commits fixing Power9/Power10 perf alternative event selection.
Thanks to: Alexey Kardashevskiy, Athira Rajeev, David Gibson, Frederic Barrat, Madhavan
Srinivasan, Miguel Ojeda, Nicholas Piggin.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEJFGtCPCthwEv2Y/bUevqPMjhpYAFAmJlSCATHG1wZUBlbGxl
cm1hbi5pZC5hdQAKCRBR6+o8yOGlgM7CD/9KH+mjtSwF3hSdun/WxMcWawdNY24g
f+eMI/vABVqN1RvmO3oC5Z1ruMUw4AxL7BMugAa/SlTjQXOyCuyHQP7vIe4ax3rZ
4TMfsRm8W4xlgI4k9d9q/unrIHko2k1OhY/wvfGMFhFdG0LWt4qJDL5vbccG5CKb
xikrutQ5+t8fNLtGH+fJVDeK9q2CU4inJRuyD4m3sfKnXygLI13l1GhcOebxN/p1
W8qBIac+YJqeezbqiwLl4BC+yXAEDixvFpTh9NuvWdoJaQHdvrltYSLQxCFMIE4B
dSp5EaBTXemalZ4F8fnGyKf4eTbtO9VIfWq3hicjfnJiFRodbFZOo7dnSpDrYlfJ
EysGdmI2HxpmWC8DgQQFv+xwZxKW/ExvPiPYb49n+j/4hKJ724wTi9Z8r3XP5fkg
lD/th40NDhe/sjCSPNWoK3l/UJb3gexd+Ict8iUp2fgNEq3FoJkTR4QlWGj6BeP3
3pOBoqmWjSXR8tWNShvyK6mLn6fclD0IA7cwTIsZZVmqI+nNR4nR0fC2Ah66Rj+R
EOof4kCBOcZ2getDyk+Hv97EFNbkDcIm6IE291Vp9hgilp0n2VnPbwwwEdexp6Jv
KpsYCHosCchnHcu7P1VFFt9w46JFSN7/euq8YZe6znFua2qhV6AGeI7H/uA2X7yL
KvuO+c+ORhrVKQ==
=xieK
-----END PGP SIGNATURE-----
Merge tag 'powerpc-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux
Pull powerpc fixes from Michael Ellerman:
- Partly revert a change to our timer_interrupt() that caused lockups
with high res timers disabled.
- Fix a bug in KVM TCE handling that could corrupt kernel memory.
- Two commits fixing Power9/Power10 perf alternative event selection.
Thanks to Alexey Kardashevskiy, Athira Rajeev, David Gibson, Frederic
Barrat, Madhavan Srinivasan, Miguel Ojeda, and Nicholas Piggin.
* tag 'powerpc-5.18-3' of git://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux:
powerpc/perf: Fix 32bit compile
powerpc/perf: Fix power10 event alternatives
powerpc/perf: Fix power9 event alternatives
KVM: PPC: Fix TCE handling for VFIO
powerpc/time: Always set decrementer in timer_interrupt()
The ROHM BD71847 PMIC has a 32.768 kHz clock.
Describe the PMIC clock to fix the following boot errors:
bd718xx-clk bd71847-clk.1.auto: No parent clk found
bd718xx-clk: probe of bd71847-clk.1.auto failed with error -22
Based on the same fix done for imx8mm-evk as per commit
a6a355ede574 ("arm64: dts: imx8mm-evk: Add 32.768 kHz clock to PMIC")
Fixes: 3e44dd09736d ("arm64: dts: imx8mn-ddr4-evk: Add rohm,bd71847 PMIC support")
Signed-off-by: Fabio Estevam <festevam@denx.de>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
The correct spelling for the property is gpios. Otherwise, the regulator
will neither reserve nor control any GPIOs. Thus, any SD/MMC card which
can use UHS-I modes will fail.
Fixes: c2e4987e0e02 ("ARM: dts: imx6ull: add Toradex Colibri iMX6ULL support")
Signed-off-by: Max Krummenacher <max.krummenacher@toradex.com>
Signed-off-by: Denys Drozdov <denys.drozdov@toradex.com>
Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>
Signed-off-by: Shawn Guo <shawnguo@kernel.org>
-----BEGIN PGP SIGNATURE-----
iHUEABYIAB0WIQRTLbB6QfY48x44uB6AXGG7T9hjvgUCYmPu9QAKCRCAXGG7T9hj
vmvYAP4sD8jB17bSSUq9jGhNIr3vc6UZ6Oz+8R6G6CvSBhOa0gEAqWTU04RVHeYy
Sqs6qs9dvSdF2AvGR9DydrkCF5n6sgU=
=s5yA
-----END PGP SIGNATURE-----
Merge tag 'for-linus-5.18-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip
Pull xen fixes from Juergen Gross:
"A simple cleanup patch and a refcount fix for Xen on Arm"
* tag 'for-linus-5.18-rc4-tag' of git://git.kernel.org/pub/scm/linux/kernel/git/xen/tip:
arm/xen: Fix some refcount leaks
xen: Convert kmap() to kmap_local_page()
Add a struct page forward declaration to cacheflush_32.h.
Fixes this build warning:
CC drivers/crypto/xilinx/zynqmp-sha.o
In file included from arch/sparc/include/asm/cacheflush.h:11,
from include/linux/cacheflush.h:5,
from drivers/crypto/xilinx/zynqmp-sha.c:6:
arch/sparc/include/asm/cacheflush_32.h:38:37: warning: 'struct page' declared inside parameter list will not be visible outside of this definition or declaration
38 | void sparc_flush_page_to_ram(struct page *page);
Exposed by commit 0e03b8fd2936 ("crypto: xilinx - Turn SHA into a
tristate and allow COMPILE_TEST") but not Fixes: that commit because the
underlying problem is older.
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: sparclinux@vger.kernel.org
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Acked-by: Herbert Xu <herbert@gondor.apana.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
The 600M clock in the fabric is not the real reference, replace it with
a 125M clock which is the correct value for the icicle kit. Rename the
msspllclk node to mssrefclk since this is now the input to, not the
output of, the msspll clock. Control of the msspll clock has been moved
into the clock configurator, so add the register range for it to the clk
configurator. Finally, add a new output of the clock config block which
will provide the 1M reference clock for the MTIMER and the rtc.
Fixes: 528a5b1f2556 ("riscv: dts: microchip: add new peripherals to icicle kit device tree")
Fixes: 0fa6107eca41 ("RISC-V: Initial DTS for Microchip ICICLE board")
Reviewed-by: Daire McNamara <daire.mcnamara@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220413075835.3354193-10-conor.dooley@microchip.com
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
The fic clocks passed to the pcie controller and other peripherals in
the device tree are not the clocks they actually run on. The fics are
actually clock domain crossers & the clock config blocks output is the
mss/cpu side input to the interconnect. The peripherals are actually
clocked by fixed frequency clocks embedded in the fpga fabric.
Fix the device tree so that these peripherals use the correct clocks.
The fabric side FIC0 & FIC1 inputs both use the same 125 MHz, so only
one clock is created for them.
Fixes: 528a5b1f2556 ("riscv: dts: microchip: add new peripherals to icicle kit device tree")
Reviewed-by: Daire McNamara <daire.mcnamara@microchip.com>
Signed-off-by: Conor Dooley <conor.dooley@microchip.com>
Link: https://lore.kernel.org/r/20220413075835.3354193-4-conor.dooley@microchip.com
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Signed-off-by: Stephen Boyd <sboyd@kernel.org>
* Remove 's' & 'u' as valid ISA extension
* Do not allow disabling the base extensions 'i'/'m'/'a'/'c'
x86:
* Fix NMI watchdog in guests on AMD
* Fix for SEV cache incoherency issues
* Don't re-acquire SRCU lock in complete_emulated_io()
* Avoid NULL pointer deref if VM creation fails
* Fix race conditions between APICv disabling and vCPU creation
* Bugfixes for disabling of APICv
* Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
selftests:
* Do not use bitfields larger than 32-bits, they differ between GCC and clang
-----BEGIN PGP SIGNATURE-----
iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmJi3KUUHHBib256aW5p
QHJlZGhhdC5jb20ACgkQv/vSX3jHroMhvQf/Yncfg3MkOvKsVxnCe7diKDTI/E2n
wBGNIcL8r7L9oIltHL4Mh7JQTacHFQOZ9PQ30NO1p+pznZ03e8LR59IF1JpP7VOU
sWrLZ5a4bIAEjOpA7Jxcee6hUBwewBauDgFLbb+YAI2lAahiH7jVfywDRife/c3k
N2LjeA75K8UvMiDCfjxxxerFJK91zaqjWlUNF2OhtFp/5pnMfS+nli9Q8QS837pZ
oUf+0Beb2RpSHan+wbYVU7X3ZLwtpR0M3w3uXOG+X3as56wDf26znXS02aSwa45x
lfX+pqJfmb4vCJJDXt6avH27EVgTq0Vew+BhQHG3VLRO6uxZ+smX6qmsuw==
=kvbw
-----END PGP SIGNATURE-----
Merge tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm
Pull kvm fixes from Paolo Bonzini:
"The main and larger change here is a workaround for AMD's lack of
cache coherency for encrypted-memory guests.
I have another patch pending, but it's waiting for review from the
architecture maintainers.
RISC-V:
- Remove 's' & 'u' as valid ISA extension
- Do not allow disabling the base extensions 'i'/'m'/'a'/'c'
x86:
- Fix NMI watchdog in guests on AMD
- Fix for SEV cache incoherency issues
- Don't re-acquire SRCU lock in complete_emulated_io()
- Avoid NULL pointer deref if VM creation fails
- Fix race conditions between APICv disabling and vCPU creation
- Bugfixes for disabling of APICv
- Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
selftests:
- Do not use bitfields larger than 32-bits, they differ between GCC
and clang"
* tag 'for-linus' of git://git.kernel.org/pub/scm/virt/kvm/kvm:
kvm: selftests: introduce and use more page size-related constants
kvm: selftests: do not use bitfields larger than 32-bits for PTEs
KVM: SEV: add cache flush to solve SEV cache incoherency issues
KVM: SVM: Flush when freeing encrypted pages even on SME_COHERENT CPUs
KVM: SVM: Simplify and harden helper to flush SEV guest page(s)
KVM: selftests: Silence compiler warning in the kvm_page_table_test
KVM: x86/pmu: Update AMD PMC sample period to fix guest NMI-watchdog
x86/kvm: Preserve BSP MSR_KVM_POLL_CONTROL across suspend/resume
KVM: SPDX style and spelling fixes
KVM: x86: Skip KVM_GUESTDBG_BLOCKIRQ APICv update if APICv is disabled
KVM: x86: Pend KVM_REQ_APICV_UPDATE during vCPU creation to fix a race
KVM: nVMX: Defer APICv updates while L2 is active until L1 is active
KVM: x86: Tag APICv DISABLE inhibit, not ABSENT, if APICv is disabled
KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref
KVM: Add helpers to wrap vcpu->srcu_idx and yell if it's abused
KVM: RISC-V: Use kvm_vcpu.srcu_idx, drop RISC-V's unnecessary copy
KVM: x86: Don't re-acquire SRCU lock in complete_emulated_io()
RISC-V: KVM: Restrict the extensions that can be disabled
RISC-V: KVM: Remove 's' & 'u' as valid ISA extension
Use the new compatible string "ralink,mt7621-pinctrl" for the Ralink MT7621
pinctrl subdriver on mt7621.dtsi.
Each subdriver needs to have a different compatible string. We don't want
the same compatible string to match a different subdriver's pinmux data as
it's not for our SoC.
Signed-off-by: Arınç ÜNAL <arinc.unal@arinc9.com>
Acked-by: Sergio Paracuellos <sergio.paracuellos@gmail.com>
Link: https://lore.kernel.org/r/20220414173916.5552-10-arinc.unal@arinc9.com
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
* A pair of build fixes for the recent cpuidle driver.
* A fix for systems without sv57 that manifests as a crash early in
boot.
-----BEGIN PGP SIGNATURE-----
iQJHBAABCAAxFiEEKzw3R0RoQ7JKlDp6LhMZ81+7GIkFAmJizQQTHHBhbG1lckBk
YWJiZWx0LmNvbQAKCRAuExnzX7sYidEWD/oC5VN4rPf1AetQeY9Zibg6sNf3X2e7
ta0vUmm1nOpAWWsxVI/JKrDCUpF6f1Vl6p6xW780SPU2pTUmAlu/3hMTaLA54Upy
J05cmLqJKRM6vaSE1frCGy6DVjGQ0iKvR9kjMkUINhs+XI/Qe5SA0OSx5e13417q
SFyo+PZ8SqzoUzKJHXvBuJpsHXM1D5T6R/XTwElYUF2QK3bA1tBScAsfB1bnWr1p
KkczI2YHrrQiIumeYal/sbtMi9MF+50Mm8tOq7RKh/dRcty1U+bDNSNemNO9BwEu
qmfz6pTwd2zUMWClt2AvZq2D5WzNVvR2sEJz6qiok+XqrYWslWhKuLjTZlRPrews
bkHCPVeBHP6wc1yHusyMgc/TumFJKxZV2PN4z/KABjrZSvta/ldHP2ykIROVidSC
J3dI5bJHA9A2EJoBeBMsH2Rbk6uYnjrUo+Ovj9Hhyjwp8JkyTqkj2nQiLaslx/yC
kOM48z1pZ0yAlF4J2meCQCVJDnbS/9PazZwVjiwg8UUpW2DcTnpI5iGFAUb3chGE
mjlyfsSx/xSYaGvrXSvbpsWBAjvgIAxSB1I4p8Ooie17w8F6WKiFzmtD9Ptr4PsF
VQP3NAJWFfXCNdMQ6MuLdw46/celhyoa2B/0VSaEFt/zSWeJamYyZ9lKiRA5tmzn
H31QFLOrWMNsZg==
=36Ir
-----END PGP SIGNATURE-----
Merge tag 'riscv-for-linus-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux
Pull RISC-V fixes Palmer Dabbelt:
- A pair of build fixes for the recent cpuidle driver
- A fix for systems without sv57 that manifests as a crash
early in boot
* tag 'riscv-for-linus-5.18-rc4' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux:
RISC-V: cpuidle: fix Kconfig select for RISCV_SBI_CPUIDLE
RISC-V: mm: Fix set_satp_mode() for platform not having Sv57
cpuidle: riscv: support non-SMP config
- Fix PMU event validation in the absence of any event counters
- Fix allmodconfig build using clang in conjunction with binutils
- Fix definitions of pXd_leaf() to handle PROT_NONE entries
- More typo fixes
-----BEGIN PGP SIGNATURE-----
iQFEBAABCgAuFiEEPxTL6PPUbjXGY88ct6xw3ITBYzQFAmJiimkQHHdpbGxAa2Vy
bmVsLm9yZwAKCRC3rHDchMFjNPVEB/9rTcl5GKh7rruMrPK2HnVCNCEzMvYJnWLz
UUd72TfsVWdEnwGRWKUSJRXEMH27Pac+yKcog8aEPzOOLB6mKOsQbLyC8X7mSpo0
hsZJLfjv2PjX0g/OnZi9Yuxqi0u+7HB5ThpKbMqbW+/tAfikqUTfuIVAC5WD2DZx
OG0IdyTdP2VL+ud0Vz/8zTyRh1kbFL82ER823dk8FytKEPGN8tIMRlv8r1YCeFQW
t1V4ZfzsfS0wZGoigOz8JDyMIzq7PNZ5cfW6Mk6wuhf32nxgJlbBWjEk0OdrMSHz
Ifv13TsIK376mMF8uFR/o8pE3UGV7y1tWoYfjq0XnSIx59bX4TFS
=J0Gy
-----END PGP SIGNATURE-----
Merge tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux
Pull arm64 fixes from Will Deacon:
"There's no real pattern to the fixes, but the main one fixes our
pmd_leaf() definition to resolve a NULL dereference on the migration
path.
- Fix PMU event validation in the absence of any event counters
- Fix allmodconfig build using clang in conjunction with binutils
- Fix definitions of pXd_leaf() to handle PROT_NONE entries
- More typo fixes"
* tag 'arm64-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux:
arm64: mm: fix p?d_leaf()
arm64: fix typos in comments
arm64: Improve HAVE_DYNAMIC_FTRACE_WITH_REGS selection for clang
arm_pmu: Validate single/group leader events
The of_find_compatible_node() function returns a node pointer with
refcount incremented, We should use of_node_put() on it when done
Add the missing of_node_put() to release the refcount.
Fixes: 9b08aaa3199a ("ARM: XEN: Move xen_early_init() before efi_init()")
Fixes: b2371587fe0c ("arm/xen: Read extended regions from DT and init Xen resource")
Signed-off-by: Miaoqian Lin <linmq006@gmail.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
These patch_text implementations are using stop_machine_cpuslocked
infrastructure with atomic cpu_count. The original idea: When the
master CPU patch_text, the others should wait for it. But current
implementation is using the first CPU as master, which couldn't
guarantee the remaining CPUs are waiting. This patch changes the
last CPU as the master to solve the potential risk.
Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
Acked-by: Palmer Dabbelt <palmer@rivosinc.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Fixes: 043cb41a85de ("riscv: introduce interfaces to patch kernel code")
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
The pmd_leaf() is used to test a leaf mapped PMD, however, it misses
the PROT_NONE mapped PMD on arm64. Fix it. A real world issue [1]
caused by this was reported by Qian Cai. Also fix pud_leaf().
Link: https://patchwork.kernel.org/comment/24798260/ [1]
Fixes: 8aa82df3c123 ("arm64: mm: add p?d_leaf() definitions")
Reported-by: Qian Cai <quic_qiancai@quicinc.com>
Signed-off-by: Muchun Song <songmuchun@bytedance.com>
Link: https://lore.kernel.org/r/20220422060033.48711-1-songmuchun@bytedance.com
Signed-off-by: Will Deacon <will@kernel.org>
There can be lots of build errors when building cpuidle-riscv-sbi.o.
They are all caused by a kconfig problem with this warning:
WARNING: unmet direct dependencies detected for RISCV_SBI_CPUIDLE
Depends on [n]: CPU_IDLE [=y] && RISCV [=y] && RISCV_SBI [=n]
Selected by [y]:
- SOC_VIRT [=y] && CPU_IDLE [=y]
so make the 'select' of RISCV_SBI_CPUIDLE also depend on RISCV_SBI.
Fixes: c5179ef1ca0c ("RISC-V: Enable RISC-V SBI CPU Idle driver for QEMU virt machine")
Signed-off-by: Randy Dunlap <rdunlap@infradead.org>
Reported-by: kernel test robot <lkp@intel.com>
Reviewed-by: Anup Patel <anup@brainfault.org>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
When Sv57 is not available the satp.MODE test in set_satp_mode() will
fail and lead to pgdir re-programming for Sv48. The pgdir re-programming
will fail as well due to pre-existing pgdir entry used for Sv57 and as
a result kernel fails to boot on RISC-V platform not having Sv57.
To fix above issue, we should clear the pgdir memory in set_satp_mode()
before re-programming.
Fixes: 011f09d12052 ("riscv: mm: Set sv57 on defaultly")
Reported-by: Mayuresh Chitale <mchitale@ventanamicro.com>
Signed-off-by: Anup Patel <apatel@ventanamicro.com>
Reviewed-by: Atish Patra <atishp@rivosinc.com>
Cc: stable@vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmer@rivosinc.com>
Flush the CPU caches when memory is reclaimed from an SEV guest (where
reclaim also includes it being unmapped from KVM's memslots). Due to lack
of coherency for SEV encrypted memory, failure to flush results in silent
data corruption if userspace is malicious/broken and doesn't ensure SEV
guest memory is properly pinned and unpinned.
Cache coherency is not enforced across the VM boundary in SEV (AMD APM
vol.2 Section 15.34.7). Confidential cachelines, generated by confidential
VM guests have to be explicitly flushed on the host side. If a memory page
containing dirty confidential cachelines was released by VM and reallocated
to another user, the cachelines may corrupt the new user at a later time.
KVM takes a shortcut by assuming all confidential memory remain pinned
until the end of VM lifetime. Therefore, KVM does not flush cache at
mmu_notifier invalidation events. Because of this incorrect assumption and
the lack of cache flushing, malicous userspace can crash the host kernel:
creating a malicious VM and continuously allocates/releases unpinned
confidential memory pages when the VM is running.
Add cache flush operations to mmu_notifier operations to ensure that any
physical memory leaving the guest VM get flushed. In particular, hook
mmu_notifier_invalidate_range_start and mmu_notifier_release events and
flush cache accordingly. The hook after releasing the mmu lock to avoid
contention with other vCPUs.
Cc: stable@vger.kernel.org
Suggested-by: Sean Christpherson <seanjc@google.com>
Reported-by: Mingwei Zhang <mizhang@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220421031407.2516575-4-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use clflush_cache_range() to flush the confidential memory when
SME_COHERENT is supported in AMD CPU. Cache flush is still needed since
SME_COHERENT only support cache invalidation at CPU side. All confidential
cache lines are still incoherent with DMA devices.
Cc: stable@vger.kerel.org
Fixes: add5e2f04541 ("KVM: SVM: Add support for the SEV-ES VMSA")
Reviewed-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220421031407.2516575-3-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rework sev_flush_guest_memory() to explicitly handle only a single page,
and harden it to fall back to WBINVD if VM_PAGE_FLUSH fails. Per-page
flushing is currently used only to flush the VMSA, and in its current
form, the helper is completely broken with respect to flushing actual
guest memory, i.e. won't work correctly for an arbitrary memory range.
VM_PAGE_FLUSH takes a host virtual address, and is subject to normal page
walks, i.e. will fault if the address is not present in the host page
tables or does not have the correct permissions. Current AMD CPUs also
do not honor SMAP overrides (undocumented in kernel versions of the APM),
so passing in a userspace address is completely out of the question. In
other words, KVM would need to manually walk the host page tables to get
the pfn, ensure the pfn is stable, and then use the direct map to invoke
VM_PAGE_FLUSH. And the latter might not even work, e.g. if userspace is
particularly evil/clever and backs the guest with Secret Memory (which
unmaps memory from the direct map).
Signed-off-by: Sean Christopherson <seanjc@google.com>
Fixes: add5e2f04541 ("KVM: SVM: Add support for the SEV-ES VMSA")
Reported-by: Mingwei Zhang <mizhang@google.com>
Cc: stable@vger.kernel.org
Signed-off-by: Mingwei Zhang <mizhang@google.com>
Message-Id: <20220421031407.2516575-2-mizhang@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
NMI-watchdog is one of the favorite features of kernel developers,
but it does not work in AMD guest even with vPMU enabled and worse,
the system misrepresents this capability via /proc.
This is a PMC emulation error. KVM does not pass the latest valid
value to perf_event in time when guest NMI-watchdog is running, thus
the perf_event corresponding to the watchdog counter will enter the
old state at some point after the first guest NMI injection, forcing
the hardware register PMC0 to be constantly written to 0x800000000001.
Meanwhile, the running counter should accurately reflect its new value
based on the latest coordinated pmc->counter (from vPMC's point of view)
rather than the value written directly by the guest.
Fixes: 168d918f2643 ("KVM: x86: Adjust counter sample period after a wrmsr")
Reported-by: Dongli Cao <caodongli@kingsoft.com>
Signed-off-by: Like Xu <likexu@tencent.com>
Reviewed-by: Yanan Wang <wangyanan55@huawei.com>
Tested-by: Yanan Wang <wangyanan55@huawei.com>
Reviewed-by: Jim Mattson <jmattson@google.com>
Message-Id: <20220409015226.38619-1-likexu@tencent.com>
Cc: stable@vger.kernel.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
MSR_KVM_POLL_CONTROL is cleared on reset, thus reverting guests to
host-side polling after suspend/resume. Non-bootstrap CPUs are
restored correctly by the haltpoll driver because they are hot-unplugged
during suspend and hot-plugged during resume; however, the BSP
is not hotpluggable and remains in host-sde polling mode after
the guest resume. The makes the guest pay for the cost of vmexits
every time the guest enters idle.
Fix it by recording BSP's haltpoll state and resuming it during guest
resume.
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Message-Id: <1650267752-46796-1-git-send-email-wanpengli@tencent.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Skip the APICv inhibit update for KVM_GUESTDBG_BLOCKIRQ if APICv is
disabled at the module level to avoid having to acquire the mutex and
potentially process all vCPUs. The DISABLE inhibit will (barring bugs)
never be lifted, so piling on more inhibits is unnecessary.
Fixes: cae72dcc3b21 ("KVM: x86: inhibit APICv when KVM_GUESTDBG_BLOCKIRQ active")
Cc: Maxim Levitsky <mlevitsk@redhat.com>
Signed-off-by: Sean Christopherson <seanjc@google.com>
Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com>
Message-Id: <20220420013732.3308816-5-seanjc@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>