Merge 'origin/master' into perf-tools-next

To get the fixes in the perf-tools branch.  Resolved a conflict due to
RISC-V's syscall table change.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
This commit is contained in:
Namhyung Kim 2024-11-03 23:16:30 -08:00
commit aa5c90601b
1060 changed files with 12411 additions and 7637 deletions

View File

@ -73,6 +73,8 @@ Andrey Ryabinin <ryabinin.a.a@gmail.com> <aryabinin@virtuozzo.com>
Andrzej Hajda <andrzej.hajda@intel.com> <a.hajda@samsung.com>
André Almeida <andrealmeid@igalia.com> <andrealmeid@collabora.com>
Andy Adamson <andros@citi.umich.edu>
Andy Chiu <andybnac@gmail.com> <andy.chiu@sifive.com>
Andy Chiu <andybnac@gmail.com> <taochiu@synology.com>
Andy Shevchenko <andy@kernel.org> <andy@smile.org.ua>
Andy Shevchenko <andy@kernel.org> <ext-andriy.shevchenko@nokia.com>
Anilkumar Kolli <quic_akolli@quicinc.com> <akolli@codeaurora.org>
@ -197,7 +199,8 @@ Elliot Berman <quic_eberman@quicinc.com> <eberman@codeaurora.org>
Enric Balletbo i Serra <eballetbo@kernel.org> <enric.balletbo@collabora.com>
Enric Balletbo i Serra <eballetbo@kernel.org> <eballetbo@iseebcn.com>
Erik Kaneda <erik.kaneda@intel.com> <erik.schmauss@intel.com>
Eugen Hristev <eugen.hristev@collabora.com> <eugen.hristev@microchip.com>
Eugen Hristev <eugen.hristev@linaro.org> <eugen.hristev@microchip.com>
Eugen Hristev <eugen.hristev@linaro.org> <eugen.hristev@collabora.com>
Evgeniy Polyakov <johnpol@2ka.mipt.ru>
Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> <ezequiel@collabora.com>
Faith Ekstrand <faith.ekstrand@collabora.com> <jason@jlekstrand.net>
@ -280,7 +283,7 @@ Jan Glauber <jan.glauber@gmail.com> <jglauber@cavium.com>
Jan Kuliga <jtkuliga.kdev@gmail.com> <jankul@alatek.krakow.pl>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@linux.intel.com>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko@profian.com>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@tuni.fi>
Jarkko Sakkinen <jarkko@kernel.org> <jarkko.sakkinen@parity.io>
Jason Gunthorpe <jgg@ziepe.ca> <jgg@mellanox.com>
Jason Gunthorpe <jgg@ziepe.ca> <jgg@nvidia.com>
Jason Gunthorpe <jgg@ziepe.ca> <jgunthorpe@obsidianresearch.com>
@ -304,6 +307,11 @@ Jens Axboe <axboe@kernel.dk> <axboe@fb.com>
Jens Axboe <axboe@kernel.dk> <axboe@meta.com>
Jens Osterkamp <Jens.Osterkamp@de.ibm.com>
Jernej Skrabec <jernej.skrabec@gmail.com> <jernej.skrabec@siol.net>
Jesper Dangaard Brouer <hawk@kernel.org> <brouer@redhat.com>
Jesper Dangaard Brouer <hawk@kernel.org> <hawk@comx.dk>
Jesper Dangaard Brouer <hawk@kernel.org> <jbrouer@redhat.com>
Jesper Dangaard Brouer <hawk@kernel.org> <jdb@comx.dk>
Jesper Dangaard Brouer <hawk@kernel.org> <netoptimizer@brouer.com>
Jessica Zhang <quic_jesszhan@quicinc.com> <jesszhan@codeaurora.org>
Jilai Wang <quic_jilaiw@quicinc.com> <jilaiw@codeaurora.org>
Jiri Kosina <jikos@kernel.org> <jikos@jikos.cz>

View File

@ -223,7 +223,10 @@ are signed through the PKCS#7 message format to enforce some level of
authorization of the policies (prohibiting an attacker from gaining
unconstrained root, and deploying an "allow all" policy). These
policies must be signed by a certificate that chains to the
``SYSTEM_TRUSTED_KEYRING``. With openssl, the policy can be signed by::
``SYSTEM_TRUSTED_KEYRING``, or to the secondary and/or platform keyrings if
``CONFIG_IPE_POLICY_SIG_SECONDARY_KEYRING`` and/or
``CONFIG_IPE_POLICY_SIG_PLATFORM_KEYRING`` are enabled, respectively.
With openssl, the policy can be signed by::
openssl smime -sign \
-in "$MY_POLICY" \
@ -266,7 +269,7 @@ in the kernel. This file is write-only and accepts a PKCS#7 signed
policy. Two checks will always be performed on this policy: First, the
``policy_names`` must match with the updated version and the existing
version. Second the updated policy must have a policy version greater than
or equal to the currently-running version. This is to prevent rollback attacks.
the currently-running version. This is to prevent rollback attacks.
The ``delete`` file is used to remove a policy that is no longer needed.
This file is write-only and accepts a value of ``1`` to delete the policy.

View File

@ -425,8 +425,8 @@ This governor exposes only one tunable:
``rate_limit_us``
Minimum time (in microseconds) that has to pass between two consecutive
runs of governor computations (default: 1000 times the scaling driver's
transition latency).
runs of governor computations (default: 1.5 times the scaling driver's
transition latency or the maximum 2ms).
The purpose of this tunable is to reduce the scheduler context overhead
of the governor which might be excessive without it.
@ -474,17 +474,17 @@ This governor exposes the following tunables:
This is how often the governor's worker routine should run, in
microseconds.
Typically, it is set to values of the order of 10000 (10 ms). Its
default value is equal to the value of ``cpuinfo_transition_latency``
for each policy this governor is attached to (but since the unit here
is greater by 1000, this means that the time represented by
``sampling_rate`` is 1000 times greater than the transition latency by
default).
Typically, it is set to values of the order of 2000 (2 ms). Its
default value is to add a 50% breathing room
to ``cpuinfo_transition_latency`` on each policy this governor is
attached to. The minimum is typically the length of two scheduler
ticks.
If this tunable is per-policy, the following shell command sets the time
represented by it to be 750 times as high as the transition latency::
represented by it to be 1.5 times as high as the transition latency
(the default)::
# echo `$(($(cat cpuinfo_transition_latency) * 750 / 1000)) > ondemand/sampling_rate
# echo `$(($(cat cpuinfo_transition_latency) * 3 / 2)) > ondemand/sampling_rate
``up_threshold``
If the estimated CPU load is above this value (in percent), the governor

View File

@ -12,7 +12,10 @@ Pkeys Userspace (PKU) is a feature which can be found on:
* Intel server CPUs, Skylake and later
* Intel client CPUs, Tiger Lake (11th Gen Core) and later
* Future AMD CPUs
* arm64 CPUs implementing the Permission Overlay Extension (FEAT_S1POE)
x86_64
======
Pkeys work by dedicating 4 previously Reserved bits in each page table entry to
a "protection key", giving 16 possible keys.
@ -28,6 +31,22 @@ register. The feature is only available in 64-bit mode, even though there is
theoretically space in the PAE PTEs. These permissions are enforced on data
access only and have no effect on instruction fetches.
arm64
=====
Pkeys use 3 bits in each page table entry, to encode a "protection key index",
giving 8 possible keys.
Protections for each key are defined with a per-CPU user-writable system
register (POR_EL0). This is a 64-bit register encoding read, write and execute
overlay permissions for each protection key index.
Being a CPU register, POR_EL0 is inherently thread-local, potentially giving
each thread a different set of protections from every other thread.
Unlike x86_64, the protection key permissions also apply to instruction
fetches.
Syscalls
========
@ -38,11 +57,10 @@ There are 3 system calls which directly interact with pkeys::
int pkey_mprotect(unsigned long start, size_t len,
unsigned long prot, int pkey);
Before a pkey can be used, it must first be allocated with
pkey_alloc(). An application calls the WRPKRU instruction
directly in order to change access permissions to memory covered
with a key. In this example WRPKRU is wrapped by a C function
called pkey_set().
Before a pkey can be used, it must first be allocated with pkey_alloc(). An
application writes to the architecture specific CPU register directly in order
to change access permissions to memory covered with a key. In this example
this is wrapped by a C function called pkey_set().
::
int real_prot = PROT_READ|PROT_WRITE;
@ -64,9 +82,9 @@ is no longer in use::
munmap(ptr, PAGE_SIZE);
pkey_free(pkey);
.. note:: pkey_set() is a wrapper for the RDPKRU and WRPKRU instructions.
An example implementation can be found in
tools/testing/selftests/x86/protection_keys.c.
.. note:: pkey_set() is a wrapper around writing to the CPU register.
Example implementations can be found in
tools/testing/selftests/mm/pkey-{arm64,powerpc,x86}.h
Behavior
========
@ -96,3 +114,7 @@ with a read()::
The kernel will send a SIGSEGV in both cases, but si_code will be set
to SEGV_PKERR when violating protection keys versus SEGV_ACCERR when
the plain mprotect() permissions are violated.
Note that kernel accesses from a kthread (such as io_uring) will use a default
value for the protection key register and so will not be consistent with
userspace's value of the register or mprotect().

View File

@ -63,6 +63,16 @@ properties:
- const: sleep
power-domains:
description: |
The MediaTek DPI module is typically associated with one of the
following multimedia power domains:
POWER_DOMAIN_DISPLAY
POWER_DOMAIN_VDOSYS
POWER_DOMAIN_MM
The specific power domain used varies depending on the SoC design.
It is recommended to explicitly add the appropriate power domain
property to the DPI node in the device tree.
maxItems: 1
port:
@ -79,20 +89,6 @@ required:
- clock-names
- port
allOf:
- if:
not:
properties:
compatible:
contains:
enum:
- mediatek,mt6795-dpi
- mediatek,mt8173-dpi
- mediatek,mt8186-dpi
then:
properties:
power-domains: false
additionalProperties: false
examples:

View File

@ -38,6 +38,7 @@ properties:
description: A phandle and PM domain specifier as defined by bindings of
the power controller specified by phandle. See
Documentation/devicetree/bindings/power/power-domain.yaml for details.
maxItems: 1
mediatek,gce-client-reg:
description:
@ -57,6 +58,9 @@ properties:
clocks:
items:
- description: SPLIT Clock
- description: Used for interfacing with the HDMI RX signal source.
- description: Paired with receiving HDMI RX metadata.
minItems: 1
required:
- compatible
@ -72,9 +76,24 @@ allOf:
const: mediatek,mt8195-mdp3-split
then:
properties:
clocks:
minItems: 3
required:
- mediatek,gce-client-reg
- if:
properties:
compatible:
contains:
const: mediatek,mt8173-disp-split
then:
properties:
clocks:
maxItems: 1
additionalProperties: false
examples:

View File

@ -67,6 +67,10 @@ properties:
A 2.5V to 3.3V supply for the external reference voltage. When omitted,
the internal 2.5V reference is used.
refin-supply:
description:
A 2.5V to 3.3V supply for external reference voltage, for ad7380-4 only.
aina-supply:
description:
The common mode voltage supply for the AINA- pin on pseudo-differential
@ -135,6 +139,23 @@ allOf:
ainc-supply: false
aind-supply: false
# ad7380-4 uses refin-supply as external reference.
# All other chips from ad738x family use refio as optional external reference.
# When refio-supply is omitted, internal reference is used.
- if:
properties:
compatible:
enum:
- adi,ad7380-4
then:
properties:
refio-supply: false
required:
- refin-supply
else:
properties:
refin-supply: false
examples:
- |
#include <dt-bindings/interrupt-controller/irq.h>

View File

@ -4,7 +4,7 @@
$id: http://devicetree.org/schemas/iio/dac/adi,ad5686.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Analog Devices AD5360 and similar DACs
title: Analog Devices AD5360 and similar SPI DACs
maintainers:
- Michael Hennerich <michael.hennerich@analog.com>
@ -12,41 +12,22 @@ maintainers:
properties:
compatible:
oneOf:
- description: SPI devices
enum:
- adi,ad5310r
- adi,ad5672r
- adi,ad5674r
- adi,ad5676
- adi,ad5676r
- adi,ad5679r
- adi,ad5681r
- adi,ad5682r
- adi,ad5683
- adi,ad5683r
- adi,ad5684
- adi,ad5684r
- adi,ad5685r
- adi,ad5686
- adi,ad5686r
- description: I2C devices
enum:
- adi,ad5311r
- adi,ad5337r
- adi,ad5338r
- adi,ad5671r
- adi,ad5675r
- adi,ad5691r
- adi,ad5692r
- adi,ad5693
- adi,ad5693r
- adi,ad5694
- adi,ad5694r
- adi,ad5695r
- adi,ad5696
- adi,ad5696r
enum:
- adi,ad5310r
- adi,ad5672r
- adi,ad5674r
- adi,ad5676
- adi,ad5676r
- adi,ad5679r
- adi,ad5681r
- adi,ad5682r
- adi,ad5683
- adi,ad5683r
- adi,ad5684
- adi,ad5684r
- adi,ad5685r
- adi,ad5686
- adi,ad5686r
reg:
maxItems: 1

View File

@ -4,7 +4,7 @@
$id: http://devicetree.org/schemas/iio/dac/adi,ad5696.yaml#
$schema: http://devicetree.org/meta-schemas/core.yaml#
title: Analog Devices AD5696 and similar multi-channel DACs
title: Analog Devices AD5696 and similar I2C multi-channel DACs
maintainers:
- Michael Auchter <michael.auchter@ni.com>
@ -16,6 +16,7 @@ properties:
compatible:
enum:
- adi,ad5311r
- adi,ad5337r
- adi,ad5338r
- adi,ad5671r
- adi,ad5675r

View File

@ -26,6 +26,7 @@ properties:
- brcm,asp-v2.1-mdio
- brcm,asp-v2.2-mdio
- brcm,unimac-mdio
- brcm,bcm6846-mdio
reg:
minItems: 1

View File

@ -154,8 +154,6 @@ allOf:
- qcom,sm8550-qmp-gen4x2-pcie-phy
- qcom,sm8650-qmp-gen3x2-pcie-phy
- qcom,sm8650-qmp-gen4x2-pcie-phy
- qcom,x1e80100-qmp-gen3x2-pcie-phy
- qcom,x1e80100-qmp-gen4x2-pcie-phy
then:
properties:
clocks:
@ -171,6 +169,8 @@ allOf:
- qcom,sc8280xp-qmp-gen3x1-pcie-phy
- qcom,sc8280xp-qmp-gen3x2-pcie-phy
- qcom,sc8280xp-qmp-gen3x4-pcie-phy
- qcom,x1e80100-qmp-gen3x2-pcie-phy
- qcom,x1e80100-qmp-gen4x2-pcie-phy
- qcom,x1e80100-qmp-gen4x4-pcie-phy
then:
properties:
@ -201,6 +201,7 @@ allOf:
- qcom,sm8550-qmp-gen4x2-pcie-phy
- qcom,sm8650-qmp-gen4x2-pcie-phy
- qcom,x1e80100-qmp-gen4x2-pcie-phy
- qcom,x1e80100-qmp-gen4x4-pcie-phy
then:
properties:
resets:

View File

@ -102,21 +102,21 @@ properties:
default: 2
interrupts:
oneOf:
- minItems: 1
items:
- description: TX interrupt
- description: RX interrupt
- items:
- description: common/combined interrupt
minItems: 1
maxItems: 2
interrupt-names:
oneOf:
- minItems: 1
- description: TX interrupt
const: tx
- description: RX interrupt
const: rx
- description: TX and RX interrupts
items:
- const: tx
- const: rx
- const: common
- description: Common/combined interrupt
const: common
fck_parent:
$ref: /schemas/types.yaml#/definitions/string

View File

@ -48,6 +48,10 @@ properties:
- const: mclk_rx
- const: hclk
port:
$ref: audio-graph-port.yaml#
unevaluatedProperties: false
resets:
maxItems: 1

View File

@ -115,7 +115,7 @@ set up cache ready for use. The following script commands are available:
This mask can also be set through sysfs, eg::
echo 5 >/sys/modules/cachefiles/parameters/debug
echo 5 > /sys/module/cachefiles/parameters/debug
Starting the Cache

View File

@ -208,7 +208,7 @@ The filesystem must arrange to `cancel
such `reservations
<https://lore.kernel.org/linux-xfs/20220817093627.GZ3600936@dread.disaster.area/>`_
because writeback will not consume the reservation.
The ``iomap_file_buffered_write_punch_delalloc`` can be called from a
The ``iomap_write_delalloc_release`` can be called from a
``->iomap_end`` function to find all the clean areas of the folios
caching a fresh (``IOMAP_F_NEW``) delalloc mapping.
It takes the ``invalidate_lock``.

View File

@ -592,4 +592,3 @@ API Function Reference
.. kernel-doc:: include/linux/netfs.h
.. kernel-doc:: fs/netfs/buffered_read.c
.. kernel-doc:: fs/netfs/io.c

View File

@ -41,13 +41,22 @@ supports only 1 SDO line.
Reference voltage
-----------------
2 possible reference voltage sources are supported:
ad7380-4
~~~~~~~~
ad7380-4 supports only an external reference voltage (2.5V to 3.3V). It must be
declared in the device tree as ``refin-supply``.
All other devices from ad738x family
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All other devices from ad738x support 2 possible reference voltage sources:
- Internal reference (2.5V)
- External reference (2.5V to 3.3V)
The source is determined by the device tree. If ``refio-supply`` is present,
then the external reference is used, else the internal reference is used.
then it is used as external reference, else the internal reference is used.
Oversampling and resolution boost
---------------------------------

View File

@ -7,26 +7,26 @@ The DAMON subsystem covers the files that are listed in 'DATA ACCESS MONITOR'
section of 'MAINTAINERS' file.
The mailing lists for the subsystem are damon@lists.linux.dev and
linux-mm@kvack.org. Patches should be made against the mm-unstable `tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>` whenever possible and posted to
the mailing lists.
linux-mm@kvack.org. Patches should be made against the `mm-unstable tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ whenever possible and posted
to the mailing lists.
SCM Trees
---------
There are multiple Linux trees for DAMON development. Patches under
development or testing are queued in `damon/next
<https://git.kernel.org/sj/h/damon/next>` by the DAMON maintainer.
<https://git.kernel.org/sj/h/damon/next>`_ by the DAMON maintainer.
Sufficiently reviewed patches will be queued in `mm-unstable
<https://git.kernel.org/akpm/mm/h/mm-unstable>` by the memory management
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ by the memory management
subsystem maintainer. After more sufficient tests, the patches will be queued
in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>` , and finally
in `mm-stable <https://git.kernel.org/akpm/mm/h/mm-stable>`_, and finally
pull-requested to the mainline by the memory management subsystem maintainer.
Note again the patches for mm-unstable `tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>` are queued by the memory
Note again the patches for `mm-unstable tree
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ are queued by the memory
management subsystem maintainer. If the patches requires some patches in
damon/next `tree <https://git.kernel.org/sj/h/damon/next>` which not yet merged
`damon/next tree <https://git.kernel.org/sj/h/damon/next>`_ which not yet merged
in mm-unstable, please make sure the requirement is clearly specified.
Submit checklist addendum
@ -37,25 +37,25 @@ When making DAMON changes, you should do below.
- Build changes related outputs including kernel and documents.
- Ensure the builds introduce no new errors or warnings.
- Run and ensure no new failures for DAMON `selftests
<https://github.com/awslabs/damon-tests/blob/master/corr/run.sh#L49>` and
<https://github.com/damonitor/damon-tests/blob/master/corr/run.sh#L49>`_ and
`kunittests
<https://github.com/awslabs/damon-tests/blob/master/corr/tests/kunit.sh>`.
<https://github.com/damonitor/damon-tests/blob/master/corr/tests/kunit.sh>`_.
Further doing below and putting the results will be helpful.
- Run `damon-tests/corr
<https://github.com/awslabs/damon-tests/tree/master/corr>` for normal
<https://github.com/damonitor/damon-tests/tree/master/corr>`_ for normal
changes.
- Run `damon-tests/perf
<https://github.com/awslabs/damon-tests/tree/master/perf>` for performance
<https://github.com/damonitor/damon-tests/tree/master/perf>`_ for performance
changes.
Key cycle dates
---------------
Patches can be sent anytime. Key cycle dates of the `mm-unstable
<https://git.kernel.org/akpm/mm/h/mm-unstable>` and `mm-stable
<https://git.kernel.org/akpm/mm/h/mm-stable>` trees depend on the memory
<https://git.kernel.org/akpm/mm/h/mm-unstable>`_ and `mm-stable
<https://git.kernel.org/akpm/mm/h/mm-stable>`_ trees depend on the memory
management subsystem maintainer.
Review cadence
@ -72,13 +72,13 @@ Mailing tool
Like many other Linux kernel subsystems, DAMON uses the mailing lists
(damon@lists.linux.dev and linux-mm@kvack.org) as the major communication
channel. There is a simple tool called `HacKerMaiL
<https://github.com/damonitor/hackermail>` (``hkml``), which is for people who
<https://github.com/damonitor/hackermail>`_ (``hkml``), which is for people who
are not very familiar with the mailing lists based communication. The tool
could be particularly helpful for DAMON community members since it is developed
and maintained by DAMON maintainer. The tool is also officially announced to
support DAMON and general Linux kernel development workflow.
In other words, `hkml <https://github.com/damonitor/hackermail>` is a mailing
In other words, `hkml <https://github.com/damonitor/hackermail>`_ is a mailing
tool for DAMON community, which DAMON maintainer is committed to support.
Please feel free to try and report issues or feature requests for the tool to
the maintainer.
@ -98,8 +98,8 @@ slots, and attendees should reserve one of those at least 24 hours before the
time slot, by reaching out to the maintainer.
Schedules and available reservation time slots are available at the Google `doc
<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`.
<https://docs.google.com/document/d/1v43Kcj3ly4CYqmAkMaZzLiM2GEnWfgdGbZAH3mi2vpM/edit?usp=sharing>`_.
There is also a public Google `calendar
<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`
<https://calendar.google.com/calendar/u/0?cid=ZDIwOTA4YTMxNjc2MDQ3NTIyMmUzYTM5ZmQyM2U4NDA0ZGIwZjBiYmJlZGQxNDM0MmY4ZTRjOTE0NjdhZDRiY0Bncm91cC5jYWxlbmRhci5nb29nbGUuY29t>`_
that has the events. Anyone can subscribe it. DAMON maintainer will also
provide periodic reminder to the mailing list (damon@lists.linux.dev).

View File

@ -16,7 +16,7 @@ ii) transmit network traffic, or any other that needs raw
Howto can be found at:
https://sites.google.com/site/packetmmap/
https://web.archive.org/web/20220404160947/https://sites.google.com/site/packetmmap/
Please send your comments to
- Ulisses Alonso Camaró <uaca@i.hate.spam.alumni.uv.es>
@ -166,7 +166,8 @@ As capture, each frame contains two parts::
/* bind socket to eth0 */
bind(this->socket, (struct sockaddr *)&my_addr, sizeof(struct sockaddr_ll));
A complete tutorial is available at: https://sites.google.com/site/packetmmap/
A complete tutorial is available at:
https://web.archive.org/web/20220404160947/https://sites.google.com/site/packetmmap/
By default, the user should put data at::

View File

@ -30,10 +30,13 @@ tree as a dedicated branch covering multiple subsystems.
The main SoC tree is housed on git.kernel.org:
https://git.kernel.org/pub/scm/linux/kernel/git/soc/soc.git/
Maintainers
-----------
Clearly this is quite a wide range of topics, which no one person, or even
small group of people are capable of maintaining. Instead, the SoC subsystem
is comprised of many submaintainers, each taking care of individual platforms
and driver subdirectories.
is comprised of many submaintainers (platform maintainers), each taking care of
individual platforms and driver subdirectories.
In this regard, "platform" usually refers to a series of SoCs from a given
vendor, for example, Nvidia's series of Tegra SoCs. Many submaintainers operate
on a vendor level, responsible for multiple product lines. For several reasons,
@ -43,14 +46,43 @@ MAINTAINERS file.
Most of these submaintainers have their own trees where they stage patches,
sending pull requests to the main SoC tree. These trees are usually, but not
always, listed in MAINTAINERS. The main SoC maintainers can be reached via the
alias soc@kernel.org if there is no platform-specific maintainer, or if they
are unresponsive.
always, listed in MAINTAINERS.
What the SoC tree is not, however, is a location for architecture-specific code
changes. Each architecture has its own maintainers that are responsible for
architectural details, CPU errata and the like.
Submitting Patches for Given SoC
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
All typical platform related patches should be sent via SoC submaintainers
(platform-specific maintainers). This includes also changes to per-platform or
shared defconfigs (scripts/get_maintainer.pl might not provide correct
addresses in such case).
Submitting Patches to the Main SoC Maintainers
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The main SoC maintainers can be reached via the alias soc@kernel.org only in
following cases:
1. There are no platform-specific maintainers.
2. Platform-specific maintainers are unresponsive.
3. Introducing a completely new SoC platform. Such new SoC work should be sent
first to common mailing lists, pointed out by scripts/get_maintainer.pl, for
community review. After positive community review, work should be sent to
soc@kernel.org in one patchset containing new arch/foo/Kconfig entry, DTS
files, MAINTAINERS file entry and optionally initial drivers with their
Devicetree bindings. The MAINTAINERS file entry should list new
platform-specific maintainers, who are going to be responsible for handling
patches for the platform from now on.
Note that the soc@kernel.org is usually not the place to discuss the patches,
thus work sent to this address should be already considered as acceptable by
the community.
Information for (new) Submaintainers
------------------------------------

View File

@ -17,7 +17,7 @@ Architecture Level of support Constraints
============= ================ ==============================================
``arm64`` Maintained Little Endian only.
``loongarch`` Maintained \-
``riscv`` Maintained ``riscv64`` only.
``riscv`` Maintained ``riscv64`` and LLVM/Clang only.
``um`` Maintained \-
``x86`` Maintained ``x86_64`` only.
============= ================ ==============================================

View File

@ -23,177 +23,166 @@ applications can additionally seal security critical data at runtime.
A similar feature already exists in the XNU kernel with the
VM_FLAGS_PERMANENT flag [1] and on OpenBSD with the mimmutable syscall [2].
User API
========
mseal()
-----------
The mseal() syscall has the following signature:
SYSCALL
=======
mseal syscall signature
-----------------------
``int mseal(void \* addr, size_t len, unsigned long flags)``
``int mseal(void addr, size_t len, unsigned long flags)``
**addr**/**len**: virtual memory address range.
The address range set by **addr**/**len** must meet:
- The start address must be in an allocated VMA.
- The start address must be page aligned.
- The end address (**addr** + **len**) must be in an allocated VMA.
- no gap (unallocated memory) between start and end address.
**addr/len**: virtual memory address range.
The ``len`` will be paged aligned implicitly by the kernel.
The address range set by ``addr``/``len`` must meet:
- The start address must be in an allocated VMA.
- The start address must be page aligned.
- The end address (``addr`` + ``len``) must be in an allocated VMA.
- no gap (unallocated memory) between start and end address.
**flags**: reserved for future use.
The ``len`` will be paged aligned implicitly by the kernel.
**Return values**:
- **0**: Success.
- **-EINVAL**:
* Invalid input ``flags``.
* The start address (``addr``) is not page aligned.
* Address range (``addr`` + ``len``) overflow.
- **-ENOMEM**:
* The start address (``addr``) is not allocated.
* The end address (``addr`` + ``len``) is not allocated.
* A gap (unallocated memory) between start and end address.
- **-EPERM**:
* sealing is supported only on 64-bit CPUs, 32-bit is not supported.
**flags**: reserved for future use.
**Note about error return**:
- For above error cases, users can expect the given memory range is
unmodified, i.e. no partial update.
- There might be other internal errors/cases not listed here, e.g.
error during merging/splitting VMAs, or the process reaching the maximum
number of supported VMAs. In those cases, partial updates to the given
memory range could happen. However, those cases should be rare.
**return values**:
**Architecture support**:
mseal only works on 64-bit CPUs, not 32-bit CPUs.
- ``0``: Success.
**Idempotent**:
users can call mseal multiple times. mseal on an already sealed memory
is a no-action (not error).
- ``-EINVAL``:
- Invalid input ``flags``.
- The start address (``addr``) is not page aligned.
- Address range (``addr`` + ``len``) overflow.
**no munseal**
Once mapping is sealed, it can't be unsealed. The kernel should never
have munseal, this is consistent with other sealing feature, e.g.
F_SEAL_SEAL for file.
- ``-ENOMEM``:
- The start address (``addr``) is not allocated.
- The end address (``addr`` + ``len``) is not allocated.
- A gap (unallocated memory) between start and end address.
Blocked mm syscall for sealed mapping
-------------------------------------
It might be important to note: **once the mapping is sealed, it will
stay in the process's memory until the process terminates**.
- ``-EPERM``:
- sealing is supported only on 64-bit CPUs, 32-bit is not supported.
Example::
- For above error cases, users can expect the given memory range is
unmodified, i.e. no partial update.
*ptr = mmap(0, 4096, PROT_READ, MAP_ANONYMOUS | MAP_PRIVATE, 0, 0);
rc = mseal(ptr, 4096, 0);
/* munmap will fail */
rc = munmap(ptr, 4096);
assert(rc < 0);
- There might be other internal errors/cases not listed here, e.g.
error during merging/splitting VMAs, or the process reaching the max
number of supported VMAs. In those cases, partial updates to the given
memory range could happen. However, those cases should be rare.
Blocked mm syscall:
- munmap
- mmap
- mremap
- mprotect and pkey_mprotect
- some destructive madvise behaviors: MADV_DONTNEED, MADV_FREE,
MADV_DONTNEED_LOCKED, MADV_FREE, MADV_DONTFORK, MADV_WIPEONFORK
**Blocked operations after sealing**:
Unmapping, moving to another location, and shrinking the size,
via munmap() and mremap(), can leave an empty space, therefore
can be replaced with a VMA with a new set of attributes.
The first set of syscalls to block is munmap, mremap, mmap. They can
either leave an empty space in the address space, therefore allowing
replacement with a new mapping with new set of attributes, or can
overwrite the existing mapping with another mapping.
Moving or expanding a different VMA into the current location,
via mremap().
mprotect and pkey_mprotect are blocked because they changes the
protection bits (RWX) of the mapping.
Modifying a VMA via mmap(MAP_FIXED).
Certain destructive madvise behaviors, specifically MADV_DONTNEED,
MADV_FREE, MADV_DONTNEED_LOCKED, and MADV_WIPEONFORK, can introduce
risks when applied to anonymous memory by threads lacking write
permissions. Consequently, these operations are prohibited under such
conditions. The aforementioned behaviors have the potential to modify
region contents by discarding pages, effectively performing a memset(0)
operation on the anonymous memory.
Size expansion, via mremap(), does not appear to pose any
specific risks to sealed VMAs. It is included anyway because
the use case is unclear. In any case, users can rely on
merging to expand a sealed VMA.
Kernel will return -EPERM for blocked syscalls.
mprotect() and pkey_mprotect().
When blocked syscall return -EPERM due to sealing, the memory regions may
or may not be changed, depends on the syscall being blocked:
Some destructive madvice() behaviors (e.g. MADV_DONTNEED)
for anonymous memory, when users don't have write permission to the
memory. Those behaviors can alter region contents by discarding pages,
effectively a memset(0) for anonymous memory.
- munmap: munmap is atomic. If one of VMAs in the given range is
sealed, none of VMAs are updated.
- mprotect, pkey_mprotect, madvise: partial update might happen, e.g.
when mprotect over multiple VMAs, mprotect might update the beginning
VMAs before reaching the sealed VMA and return -EPERM.
- mmap and mremap: undefined behavior.
Kernel will return -EPERM for blocked operations.
For blocked operations, one can expect the given address is unmodified,
i.e. no partial update. Note, this is different from existing mm
system call behaviors, where partial updates are made till an error is
found and returned to userspace. To give an example:
Assume following code sequence:
- ptr = mmap(null, 8192, PROT_NONE);
- munmap(ptr + 4096, 4096);
- ret1 = mprotect(ptr, 8192, PROT_READ);
- mseal(ptr, 4096);
- ret2 = mprotect(ptr, 8192, PROT_NONE);
ret1 will be -ENOMEM, the page from ptr is updated to PROT_READ.
ret2 will be -EPERM, the page remains to be PROT_READ.
**Note**:
- mseal() only works on 64-bit CPUs, not 32-bit CPU.
- users can call mseal() multiple times, mseal() on an already sealed memory
is a no-action (not error).
- munseal() is not supported.
Use cases:
==========
Use cases
=========
- glibc:
The dynamic linker, during loading ELF executables, can apply sealing to
non-writable memory segments.
mapping segments.
- Chrome browser: protect some security sensitive data-structures.
- Chrome browser: protect some security sensitive data structures.
Notes on which memory to seal:
==============================
It might be important to note that sealing changes the lifetime of a mapping,
i.e. the sealed mapping wont be unmapped till the process terminates or the
exec system call is invoked. Applications can apply sealing to any virtual
memory region from userspace, but it is crucial to thoroughly analyze the
mapping's lifetime prior to apply the sealing.
When not to use mseal
=====================
Applications can apply sealing to any virtual memory region from userspace,
but it is *crucial to thoroughly analyze the mapping's lifetime* prior to
apply the sealing. This is because the sealed mapping *wont be unmapped*
until the process terminates or the exec system call is invoked.
For example:
- aio/shm
aio/shm can call mmap and munmap on behalf of userspace, e.g.
ksys_shmdt() in shm.c. The lifetimes of those mapping are not tied to
the lifetime of the process. If those memories are sealed from userspace,
then munmap will fail, causing leaks in VMA address space during the
lifetime of the process.
- aio/shm
- ptr allocated by malloc (heap)
Don't use mseal on the memory ptr return from malloc().
malloc() is implemented by allocator, e.g. by glibc. Heap manager might
allocate a ptr from brk or mapping created by mmap.
If an app calls mseal on a ptr returned from malloc(), this can affect
the heap manager's ability to manage the mappings; the outcome is
non-deterministic.
aio/shm can call mmap()/munmap() on behalf of userspace, e.g. ksys_shmdt() in
shm.c. The lifetime of those mapping are not tied to the lifetime of the
process. If those memories are sealed from userspace, then munmap() will fail,
causing leaks in VMA address space during the lifetime of the process.
Example::
- Brk (heap)
ptr = malloc(size);
/* don't call mseal on ptr return from malloc. */
mseal(ptr, size);
/* free will success, allocator can't shrink heap lower than ptr */
free(ptr);
Currently, userspace applications can seal parts of the heap by calling
malloc() and mseal().
let's assume following calls from user space:
mseal doesn't block
===================
In a nutshell, mseal blocks certain mm syscall from modifying some of VMA's
attributes, such as protection bits (RWX). Sealed mappings doesn't mean the
memory is immutable.
- ptr = malloc(size);
- mprotect(ptr, size, RO);
- mseal(ptr, size);
- free(ptr);
Technically, before mseal() is added, the user can change the protection of
the heap by calling mprotect(RO). As long as the user changes the protection
back to RW before free(), the memory range can be reused.
Adding mseal() into the picture, however, the heap is then sealed partially,
the user can still free it, but the memory remains to be RO. If the address
is re-used by the heap manager for another malloc, the process might crash
soon after. Therefore, it is important not to apply sealing to any memory
that might get recycled.
Furthermore, even if the application never calls the free() for the ptr,
the heap manager may invoke the brk system call to shrink the size of the
heap. In the kernel, the brk-shrink will call munmap(). Consequently,
depending on the location of the ptr, the outcome of brk-shrink is
nondeterministic.
Additional notes:
=================
As Jann Horn pointed out in [3], there are still a few ways to write
to RO memory, which is, in a way, by design. Those cases are not covered
by mseal(). If applications want to block such cases, sandbox tools (such as
seccomp, LSM, etc) might be considered.
to RO memory, which is, in a way, by design. And those could be blocked
by different security measures.
Those cases are:
- Write to read-only memory through /proc/self/mem interface.
- Write to read-only memory through ptrace (such as PTRACE_POKETEXT).
- userfaultfd.
- Write to read-only memory through /proc/self/mem interface (FOLL_FORCE).
- Write to read-only memory through ptrace (such as PTRACE_POKETEXT).
- userfaultfd.
The idea that inspired this patch comes from Stephen Röttgers work in V8
CFI [4]. Chrome browser in ChromeOS will be the first user of this API.
Reference:
==========
[1] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b9f69dbd3c8c3fd30a/osfmk/mach/vm_statistics.h#L274
[2] https://man.openbsd.org/mimmutable.2
[3] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426FkcgnfUGLvA@mail.gmail.com
[4] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXgeaRHo/edit#heading=h.bvaojj9fu6hc
Reference
=========
- [1] https://github.com/apple-oss-distributions/xnu/blob/1031c584a5e37aff177559b9f69dbd3c8c3fd30a/osfmk/mach/vm_statistics.h#L274
- [2] https://man.openbsd.org/mimmutable.2
- [3] https://lore.kernel.org/lkml/CAG48ez3ShUYey+ZAFsU2i1RpQn0a5eOs2hzQ426FkcgnfUGLvA@mail.gmail.com
- [4] https://docs.google.com/document/d/1O2jwK4dxI3nRcOJuPYkonhTkNQfbmwdvxQMyXgeaRHo/edit#heading=h.bvaojj9fu6hc

View File

@ -8098,13 +8098,15 @@ KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS By default, KVM emulates MONITOR/MWAIT (if
KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT is
disabled.
KVM_X86_QUIRK_SLOT_ZAP_ALL By default, KVM invalidates all SPTEs in
fast way for memslot deletion when VM type
is KVM_X86_DEFAULT_VM.
When this quirk is disabled or when VM type
is other than KVM_X86_DEFAULT_VM, KVM zaps
only leaf SPTEs that are within the range of
the memslot being deleted.
KVM_X86_QUIRK_SLOT_ZAP_ALL By default, for KVM_X86_DEFAULT_VM VMs, KVM
invalidates all SPTEs in all memslots and
address spaces when a memslot is deleted or
moved. When this quirk is disabled (or the
VM type isn't KVM_X86_DEFAULT_VM), KVM only
ensures the backing memory of the deleted
or moved memslot isn't reachable, i.e KVM
_may_ invalidate only SPTEs related to the
memslot.
=================================== ============================================
7.32 KVM_CAP_MAX_VCPU_ID

View File

@ -136,7 +136,7 @@ For direct sp, we can easily avoid it since the spte of direct sp is fixed
to gfn. For indirect sp, we disabled fast page fault for simplicity.
A solution for indirect sp could be to pin the gfn, for example via
kvm_vcpu_gfn_to_pfn_atomic, before the cmpxchg. After the pinning:
gfn_to_pfn_memslot_atomic, before the cmpxchg. After the pinning:
- We have held the refcount of pfn; that means the pfn can not be freed and
be reused for another gfn.

View File

@ -258,12 +258,6 @@ L: linux-acenic@sunsite.dk
S: Maintained
F: drivers/net/ethernet/alteon/acenic*
ACER ASPIRE 1 EMBEDDED CONTROLLER DRIVER
M: Nikita Travkin <nikita@trvn.ru>
S: Maintained
F: Documentation/devicetree/bindings/platform/acer,aspire1-ec.yaml
F: drivers/platform/arm64/acer-aspire1-ec.c
ACER ASPIRE ONE TEMPERATURE AND FAN DRIVER
M: Peter Kaestle <peter@piie.net>
L: platform-driver-x86@vger.kernel.org
@ -888,7 +882,6 @@ F: drivers/staging/media/sunxi/cedrus/
ALPHA PORT
M: Richard Henderson <richard.henderson@linaro.org>
M: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
M: Matt Turner <mattst88@gmail.com>
L: linux-alpha@vger.kernel.org
S: Odd Fixes
@ -1761,8 +1754,8 @@ F: include/uapi/linux/if_arcnet.h
ARM AND ARM64 SoC SUB-ARCHITECTURES (COMMON PARTS)
M: Arnd Bergmann <arnd@arndb.de>
M: Olof Johansson <olof@lixom.net>
M: soc@kernel.org
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: soc@lists.linux.dev
S: Maintained
P: Documentation/process/maintainer-soc.rst
C: irc://irc.libera.chat/armlinux
@ -2263,12 +2256,6 @@ L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Maintained
F: arch/arm/mach-ep93xx/ts72xx.c
ARM/CIRRUS LOGIC CLPS711X ARM ARCHITECTURE
M: Alexander Shiyan <shc_work@mail.ru>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Odd Fixes
N: clps711x
ARM/CIRRUS LOGIC EP93XX ARM ARCHITECTURE
M: Hartley Sweeten <hsweeten@visionengravers.com>
M: Alexander Sverdlin <alexander.sverdlin@gmail.com>
@ -3815,14 +3802,6 @@ F: drivers/video/backlight/
F: include/linux/backlight.h
F: include/linux/pwm_backlight.h
BAIKAL-T1 PVT HARDWARE MONITOR DRIVER
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-hwmon@vger.kernel.org
S: Supported
F: Documentation/devicetree/bindings/hwmon/baikal,bt1-pvt.yaml
F: Documentation/hwmon/bt1-pvt.rst
F: drivers/hwmon/bt1-pvt.[ch]
BARCO P50 GPIO DRIVER
M: Santosh Kumar Yadav <santoshkumar.yadav@barco.com>
M: Peter Korsgaard <peter.korsgaard@barco.com>
@ -6476,7 +6455,6 @@ F: drivers/mtd/nand/raw/denali*
DESIGNWARE EDMA CORE IP DRIVER
M: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
R: Serge Semin <fancer.lancer@gmail.com>
L: dmaengine@vger.kernel.org
S: Maintained
F: drivers/dma/dw-edma/
@ -9745,6 +9723,7 @@ F: include/dt-bindings/gpio/
F: include/linux/gpio.h
F: include/linux/gpio/
F: include/linux/of_gpio.h
K: (devm_)?gpio_(request|free|direction|get|set)
GPIO UAPI
M: Bartosz Golaszewski <brgl@bgdev.pl>
@ -9759,14 +9738,6 @@ F: drivers/gpio/gpiolib-cdev.c
F: include/uapi/linux/gpio.h
F: tools/gpio/
GRE DEMULTIPLEXER DRIVER
M: Dmitry Kozlov <xeb@mail.ru>
L: netdev@vger.kernel.org
S: Maintained
F: include/net/gre.h
F: net/ipv4/gre_demux.c
F: net/ipv4/gre_offload.c
GRETH 10/100/1G Ethernet MAC device driver
M: Andreas Larsson <andreas@gaisler.com>
L: netdev@vger.kernel.org
@ -11283,10 +11254,10 @@ F: security/integrity/
F: security/integrity/ima/
INTEGRITY POLICY ENFORCEMENT (IPE)
M: Fan Wu <wufan@linux.microsoft.com>
M: Fan Wu <wufan@kernel.org>
L: linux-security-module@vger.kernel.org
S: Supported
T: git https://github.com/microsoft/ipe.git
T: git git://git.kernel.org/pub/scm/linux/kernel/git/wufan/ipe.git
F: Documentation/admin-guide/LSM/ipe.rst
F: Documentation/security/ipe.rst
F: scripts/ipe/
@ -11604,6 +11575,16 @@ F: drivers/crypto/intel/keembay/keembay-ocs-hcu-core.c
F: drivers/crypto/intel/keembay/ocs-hcu.c
F: drivers/crypto/intel/keembay/ocs-hcu.h
INTEL LA JOLLA COVE ADAPTER (LJCA) USB I/O EXPANDER DRIVERS
M: Wentong Wu <wentong.wu@intel.com>
M: Sakari Ailus <sakari.ailus@linux.intel.com>
S: Maintained
F: drivers/gpio/gpio-ljca.c
F: drivers/i2c/busses/i2c-ljca.c
F: drivers/spi/spi-ljca.c
F: drivers/usb/misc/usb-ljca.c
F: include/linux/usb/ljca.h
INTEL MANAGEMENT ENGINE (mei)
M: Tomas Winkler <tomas.winkler@intel.com>
L: linux-kernel@vger.kernel.org
@ -12242,6 +12223,7 @@ R: Dmitry Vyukov <dvyukov@google.com>
R: Vincenzo Frascino <vincenzo.frascino@arm.com>
L: kasan-dev@googlegroups.com
S: Maintained
B: https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
F: Documentation/dev-tools/kasan.rst
F: arch/*/include/asm/*kasan.h
F: arch/*/mm/kasan_init*
@ -12265,6 +12247,7 @@ R: Dmitry Vyukov <dvyukov@google.com>
R: Andrey Konovalov <andreyknvl@gmail.com>
L: kasan-dev@googlegroups.com
S: Maintained
B: https://bugzilla.kernel.org/buglist.cgi?component=Sanitizers&product=Memory%20Management
F: Documentation/dev-tools/kcov.rst
F: include/linux/kcov.h
F: include/uapi/linux/kcov.h
@ -12947,12 +12930,6 @@ S: Maintained
F: drivers/ata/pata_arasan_cf.c
F: include/linux/pata_arasan_cf_data.h
LIBATA PATA DRIVERS
R: Sergey Shtylyov <s.shtylyov@omp.ru>
L: linux-ide@vger.kernel.org
F: drivers/ata/ata_*.c
F: drivers/ata/pata_*.c
LIBATA PATA FARADAY FTIDE010 AND GEMINI SATA BRIDGE DRIVERS
M: Linus Walleij <linus.walleij@linaro.org>
L: linux-ide@vger.kernel.org
@ -12969,14 +12946,6 @@ F: drivers/ata/ahci_platform.c
F: drivers/ata/libahci_platform.c
F: include/linux/ahci_platform.h
LIBATA SATA AHCI SYNOPSYS DWC CONTROLLER DRIVER
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-ide@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/ata/baikal,bt1-ahci.yaml
F: Documentation/devicetree/bindings/ata/snps,dwc-ahci.yaml
F: drivers/ata/ahci_dwc.c
LIBATA SATA PROMISE TX2/TX4 CONTROLLER DRIVER
M: Mikael Pettersson <mikpelinux@gmail.com>
L: linux-ide@vger.kernel.org
@ -14173,8 +14142,7 @@ T: git git://linuxtv.org/media_tree.git
F: drivers/media/platform/nxp/imx-pxp.[ch]
MEDIA DRIVERS FOR ASCOT2E
M: Sergey Kozlov <serjk@netup.ru>
M: Abylay Ospan <aospan@netup.ru>
M: Abylay Ospan <aospan@amazon.com>
L: linux-media@vger.kernel.org
S: Supported
W: https://linuxtv.org
@ -14191,8 +14159,7 @@ T: git git://linuxtv.org/media_tree.git
F: drivers/media/dvb-frontends/cxd2099*
MEDIA DRIVERS FOR CXD2841ER
M: Sergey Kozlov <serjk@netup.ru>
M: Abylay Ospan <aospan@netup.ru>
M: Abylay Ospan <aospan@amazon.com>
L: linux-media@vger.kernel.org
S: Supported
W: https://linuxtv.org
@ -14245,7 +14212,7 @@ F: drivers/media/platform/nxp/imx7-media-csi.c
F: drivers/media/platform/nxp/imx8mq-mipi-csi2.c
MEDIA DRIVERS FOR HELENE
M: Abylay Ospan <aospan@netup.ru>
M: Abylay Ospan <aospan@amazon.com>
L: linux-media@vger.kernel.org
S: Supported
W: https://linuxtv.org
@ -14254,8 +14221,7 @@ T: git git://linuxtv.org/media_tree.git
F: drivers/media/dvb-frontends/helene*
MEDIA DRIVERS FOR HORUS3A
M: Sergey Kozlov <serjk@netup.ru>
M: Abylay Ospan <aospan@netup.ru>
M: Abylay Ospan <aospan@amazon.com>
L: linux-media@vger.kernel.org
S: Supported
W: https://linuxtv.org
@ -14264,8 +14230,7 @@ T: git git://linuxtv.org/media_tree.git
F: drivers/media/dvb-frontends/horus3a*
MEDIA DRIVERS FOR LNBH25
M: Sergey Kozlov <serjk@netup.ru>
M: Abylay Ospan <aospan@netup.ru>
M: Abylay Ospan <aospan@amazon.com>
L: linux-media@vger.kernel.org
S: Supported
W: https://linuxtv.org
@ -14281,8 +14246,7 @@ T: git git://linuxtv.org/media_tree.git
F: drivers/media/dvb-frontends/mxl5xx*
MEDIA DRIVERS FOR NETUP PCI UNIVERSAL DVB devices
M: Sergey Kozlov <serjk@netup.ru>
M: Abylay Ospan <aospan@netup.ru>
M: Abylay Ospan <aospan@amazon.com>
L: linux-media@vger.kernel.org
S: Supported
W: https://linuxtv.org
@ -14907,9 +14871,10 @@ N: include/linux/page[-_]*
MEMORY MAPPING
M: Andrew Morton <akpm@linux-foundation.org>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
M: Liam R. Howlett <Liam.Howlett@oracle.com>
M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Vlastimil Babka <vbabka@suse.cz>
R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Jann Horn <jannh@google.com>
L: linux-mm@kvack.org
S: Maintained
W: http://www.linux-mm.org
@ -14932,13 +14897,6 @@ F: drivers/mtd/
F: include/linux/mtd/
F: include/uapi/mtd/
MEMSENSING MICROSYSTEMS MSA311 DRIVER
M: Dmitry Rokosov <ddrokosov@sberdevices.ru>
L: linux-iio@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/iio/accel/memsensing,msa311.yaml
F: drivers/iio/accel/msa311.c
MEN A21 WATCHDOG DRIVER
M: Johannes Thumshirn <morbidrsa@gmail.com>
L: linux-watchdog@vger.kernel.org
@ -15083,6 +15041,7 @@ F: drivers/spi/spi-at91-usart.c
MICROCHIP AUDIO ASOC DRIVERS
M: Claudiu Beznea <claudiu.beznea@tuxon.dev>
M: Andrei Simion <andrei.simion@microchip.com>
L: linux-sound@vger.kernel.org
S: Supported
F: Documentation/devicetree/bindings/sound/atmel*
@ -15191,6 +15150,7 @@ F: include/video/atmel_lcdc.h
MICROCHIP MCP16502 PMIC DRIVER
M: Claudiu Beznea <claudiu.beznea@tuxon.dev>
M: Andrei Simion <andrei.simion@microchip.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Supported
F: Documentation/devicetree/bindings/regulator/microchip,mcp16502.yaml
@ -15272,7 +15232,6 @@ F: drivers/tty/serial/8250/8250_pci1xxxx.c
MICROCHIP POLARFIRE FPGA DRIVERS
M: Conor Dooley <conor.dooley@microchip.com>
R: Vladimir Georgiev <v.georgiev@metrotek.ru>
L: linux-fpga@vger.kernel.org
S: Supported
F: Documentation/devicetree/bindings/fpga/microchip,mpf-spi-fpga-mgr.yaml
@ -15322,6 +15281,7 @@ F: drivers/spi/spi-atmel.*
MICROCHIP SSC DRIVER
M: Claudiu Beznea <claudiu.beznea@tuxon.dev>
M: Andrei Simion <andrei.simion@microchip.com>
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
S: Supported
F: Documentation/devicetree/bindings/misc/atmel-ssc.txt
@ -15527,17 +15487,6 @@ F: arch/mips/
F: drivers/platform/mips/
F: include/dt-bindings/mips/
MIPS BAIKAL-T1 PLATFORM
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-mips@vger.kernel.org
S: Supported
F: Documentation/devicetree/bindings/bus/baikal,bt1-*.yaml
F: Documentation/devicetree/bindings/clock/baikal,bt1-*.yaml
F: drivers/bus/bt1-*.c
F: drivers/clk/baikal-t1/
F: drivers/memory/bt1-l2-ctl.c
F: drivers/mtd/maps/physmap-bt1-rom.[ch]
MIPS BOSTON DEVELOPMENT BOARD
M: Paul Burton <paulburton@kernel.org>
L: linux-mips@vger.kernel.org
@ -15550,7 +15499,6 @@ F: include/dt-bindings/clock/boston-clock.h
MIPS CORE DRIVERS
M: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-mips@vger.kernel.org
S: Supported
F: drivers/bus/mips_cdmm.c
@ -16086,6 +16034,7 @@ F: include/uapi/linux/net_dropmon.h
F: net/core/drop_monitor.c
NETWORKING DRIVERS
M: Andrew Lunn <andrew+netdev@lunn.ch>
M: "David S. Miller" <davem@davemloft.net>
M: Eric Dumazet <edumazet@google.com>
M: Jakub Kicinski <kuba@kernel.org>
@ -16151,6 +16100,7 @@ M: "David S. Miller" <davem@davemloft.net>
M: Eric Dumazet <edumazet@google.com>
M: Jakub Kicinski <kuba@kernel.org>
M: Paolo Abeni <pabeni@redhat.com>
R: Simon Horman <horms@kernel.org>
L: netdev@vger.kernel.org
S: Maintained
P: Documentation/process/maintainer-netdev.rst
@ -16193,6 +16143,7 @@ F: include/uapi/linux/rtnetlink.h
F: lib/net_utils.c
F: lib/random32.c
F: net/
F: samples/pktgen/
F: tools/net/
F: tools/testing/selftests/net/
X: Documentation/networking/mac80211-injection.rst
@ -16517,12 +16468,6 @@ F: include/linux/ntb.h
F: include/linux/ntb_transport.h
F: tools/testing/selftests/ntb/
NTB IDT DRIVER
M: Serge Semin <fancer.lancer@gmail.com>
L: ntb@lists.linux.dev
S: Supported
F: drivers/ntb/hw/idt/
NTB INTEL DRIVER
M: Dave Jiang <dave.jiang@intel.com>
L: ntb@lists.linux.dev
@ -18543,13 +18488,6 @@ F: drivers/pps/
F: include/linux/pps*.h
F: include/uapi/linux/pps.h
PPTP DRIVER
M: Dmitry Kozlov <xeb@mail.ru>
L: netdev@vger.kernel.org
S: Maintained
W: http://sourceforge.net/projects/accel-pptp
F: drivers/net/ppp/pptp.c
PRESSURE STALL INFORMATION (PSI)
M: Johannes Weiner <hannes@cmpxchg.org>
M: Suren Baghdasaryan <surenb@google.com>
@ -19523,6 +19461,14 @@ S: Maintained
F: Documentation/tools/rtla/
F: tools/tracing/rtla/
Real-time Linux (PREEMPT_RT)
M: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
M: Clark Williams <clrkwllms@kernel.org>
M: Steven Rostedt <rostedt@goodmis.org>
L: linux-rt-devel@lists.linux.dev
S: Supported
K: PREEMPT_RT
REALTEK AUDIO CODECS
M: Oder Chiou <oder_chiou@realtek.com>
S: Maintained
@ -19632,15 +19578,6 @@ S: Supported
F: Documentation/devicetree/bindings/i2c/renesas,iic-emev2.yaml
F: drivers/i2c/busses/i2c-emev2.c
RENESAS ETHERNET AVB DRIVER
R: Sergey Shtylyov <s.shtylyov@omp.ru>
L: netdev@vger.kernel.org
L: linux-renesas-soc@vger.kernel.org
F: Documentation/devicetree/bindings/net/renesas,etheravb.yaml
F: drivers/net/ethernet/renesas/Kconfig
F: drivers/net/ethernet/renesas/Makefile
F: drivers/net/ethernet/renesas/ravb*
RENESAS ETHERNET SWITCH DRIVER
R: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
L: netdev@vger.kernel.org
@ -19690,14 +19627,6 @@ F: Documentation/devicetree/bindings/i2c/renesas,rmobile-iic.yaml
F: drivers/i2c/busses/i2c-rcar.c
F: drivers/i2c/busses/i2c-sh_mobile.c
RENESAS R-CAR SATA DRIVER
R: Sergey Shtylyov <s.shtylyov@omp.ru>
L: linux-ide@vger.kernel.org
L: linux-renesas-soc@vger.kernel.org
S: Supported
F: Documentation/devicetree/bindings/ata/renesas,rcar-sata.yaml
F: drivers/ata/sata_rcar.c
RENESAS R-CAR THERMAL DRIVERS
M: Niklas Söderlund <niklas.soderlund@ragnatech.se>
L: linux-renesas-soc@vger.kernel.org
@ -19773,16 +19702,6 @@ S: Supported
F: Documentation/devicetree/bindings/i2c/renesas,rzv2m.yaml
F: drivers/i2c/busses/i2c-rzv2m.c
RENESAS SUPERH ETHERNET DRIVER
R: Sergey Shtylyov <s.shtylyov@omp.ru>
L: netdev@vger.kernel.org
L: linux-renesas-soc@vger.kernel.org
F: Documentation/devicetree/bindings/net/renesas,ether.yaml
F: drivers/net/ethernet/renesas/Kconfig
F: drivers/net/ethernet/renesas/Makefile
F: drivers/net/ethernet/renesas/sh_eth*
F: include/linux/sh_eth.h
RENESAS USB PHY DRIVER
M: Yoshihiro Shimoda <yoshihiro.shimoda.uh@renesas.com>
L: linux-renesas-soc@vger.kernel.org
@ -21777,8 +21696,8 @@ F: drivers/accessibility/speakup/
SPEAR PLATFORM/CLOCK/PINCTRL SUPPORT
M: Viresh Kumar <vireshk@kernel.org>
M: Shiraz Hashim <shiraz.linux.kernel@gmail.com>
M: soc@kernel.org
L: linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
L: soc@lists.linux.dev
S: Maintained
W: http://www.st.com/spear
F: arch/arm/boot/dts/st/spear*
@ -22436,19 +22355,11 @@ F: drivers/tty/serial/8250/8250_lpss.c
SYNOPSYS DESIGNWARE APB GPIO DRIVER
M: Hoan Tran <hoan@os.amperecomputing.com>
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-gpio@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/gpio/snps,dw-apb-gpio.yaml
F: drivers/gpio/gpio-dwapb.c
SYNOPSYS DESIGNWARE APB SSI DRIVER
M: Serge Semin <fancer.lancer@gmail.com>
L: linux-spi@vger.kernel.org
S: Supported
F: Documentation/devicetree/bindings/spi/snps,dw-apb-ssi.yaml
F: drivers/spi/spi-dw*
SYNOPSYS DESIGNWARE AXI DMAC DRIVER
M: Eugeniy Paltsev <Eugeniy.Paltsev@synopsys.com>
S: Maintained
@ -23292,7 +23203,7 @@ F: Documentation/devicetree/bindings/iio/adc/ti,lmp92064.yaml
F: drivers/iio/adc/ti-lmp92064.c
TI PCM3060 ASoC CODEC DRIVER
M: Kirill Marinushkin <kmarinushkin@birdec.com>
M: Kirill Marinushkin <k.marinushkin@gmail.com>
L: linux-sound@vger.kernel.org
S: Maintained
F: Documentation/devicetree/bindings/sound/pcm3060.txt
@ -23758,12 +23669,6 @@ L: linux-input@vger.kernel.org
S: Maintained
F: drivers/hid/hid-udraw-ps3.c
UFS FILESYSTEM
M: Evgeniy Dushistov <dushistov@mail.ru>
S: Maintained
F: Documentation/admin-guide/ufs.rst
F: fs/ufs/
UHID USERSPACE HID IO DRIVER
M: David Rheinsberg <david@readahead.eu>
L: linux-input@vger.kernel.org
@ -24066,6 +23971,7 @@ USB RAW GADGET DRIVER
R: Andrey Konovalov <andreyknvl@gmail.com>
L: linux-usb@vger.kernel.org
S: Maintained
B: https://github.com/xairy/raw-gadget/issues
F: Documentation/usb/raw-gadget.rst
F: drivers/usb/gadget/legacy/raw_gadget.c
F: include/uapi/linux/usb/raw_gadget.h
@ -24737,9 +24643,10 @@ F: tools/testing/vsock/
VMA
M: Andrew Morton <akpm@linux-foundation.org>
R: Liam R. Howlett <Liam.Howlett@oracle.com>
M: Liam R. Howlett <Liam.Howlett@oracle.com>
M: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Vlastimil Babka <vbabka@suse.cz>
R: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
R: Jann Horn <jannh@google.com>
L: linux-mm@kvack.org
S: Maintained
W: https://www.linux-mm.org

View File

@ -2,7 +2,7 @@
VERSION = 6
PATCHLEVEL = 12
SUBLEVEL = 0
EXTRAVERSION = -rc3
EXTRAVERSION = -rc6
NAME = Baby Opossum Posse
# *DOCUMENTATION*

View File

@ -838,7 +838,7 @@ config CFI_CLANG
config CFI_ICALL_NORMALIZE_INTEGERS
bool "Normalize CFI tags for integers"
depends on CFI_CLANG
depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS
depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG
help
This option normalizes the CFI tags for integer types so that all
integer types of the same size and signedness receive the same CFI
@ -851,21 +851,19 @@ config CFI_ICALL_NORMALIZE_INTEGERS
This option is necessary for using CFI with Rust. If unsure, say N.
config HAVE_CFI_ICALL_NORMALIZE_INTEGERS
def_bool !GCOV_KERNEL && !KASAN
depends on CFI_CLANG
config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG
def_bool y
depends on $(cc-option,-fsanitize=kcfi -fsanitize-cfi-icall-experimental-normalize-integers)
help
Is CFI_ICALL_NORMALIZE_INTEGERS supported with the set of compilers
currently in use?
# With GCOV/KASAN we need this fix: https://github.com/llvm/llvm-project/pull/104826
depends on CLANG_VERSION >= 190103 || (!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS)
This option defaults to false if GCOV or KASAN is enabled, as there is
an LLVM bug that makes normalized integers tags incompatible with
KASAN and GCOV. Kconfig currently does not have the infrastructure to
detect whether your rustc compiler contains the fix for this bug, so
it is assumed that it doesn't. If your compiler has the fix, you can
explicitly enable this option in your config file. The Kconfig logic
needed to detect this will be added in a future kernel release.
config HAVE_CFI_ICALL_NORMALIZE_INTEGERS_RUSTC
def_bool y
depends on HAVE_CFI_ICALL_NORMALIZE_INTEGERS_CLANG
depends on RUSTC_VERSION >= 107900
# With GCOV/KASAN we need this fix: https://github.com/rust-lang/rust/pull/129373
depends on (RUSTC_LLVM_VERSION >= 190103 && RUSTC_VERSION >= 108200) || \
(!GCOV_KERNEL && !KASAN_GENERIC && !KASAN_SW_TAGS)
config CFI_PERMISSIVE
bool "Use CFI in permissive mode"

View File

@ -77,7 +77,7 @@ &gpio {
};
&hdmi {
hpd-gpios = <&expgpio 1 GPIO_ACTIVE_LOW>;
hpd-gpios = <&expgpio 0 GPIO_ACTIVE_LOW>;
power-domains = <&power RPI_POWER_DOMAIN_HDMI>;
status = "okay";
};

View File

@ -136,7 +136,7 @@ cp0_i2c0_pins: cp0-i2c0-pins {
};
cp0_mdio_pins: cp0-mdio-pins {
marvell,pins = "mpp40", "mpp41";
marvell,pins = "mpp0", "mpp1";
marvell,function = "ge";
};

View File

@ -178,6 +178,7 @@ struct kvm_nvhe_init_params {
unsigned long hcr_el2;
unsigned long vttbr;
unsigned long vtcr;
unsigned long tmp;
};
/*

View File

@ -51,6 +51,7 @@
#define KVM_REQ_RELOAD_PMU KVM_ARCH_REQ(5)
#define KVM_REQ_SUSPEND KVM_ARCH_REQ(6)
#define KVM_REQ_RESYNC_PMU_EL0 KVM_ARCH_REQ(7)
#define KVM_REQ_NESTED_S2_UNMAP KVM_ARCH_REQ(8)
#define KVM_DIRTY_LOG_MANUAL_CAPS (KVM_DIRTY_LOG_MANUAL_PROTECT_ENABLE | \
KVM_DIRTY_LOG_INITIALLY_SET)
@ -211,6 +212,12 @@ struct kvm_s2_mmu {
*/
bool nested_stage2_enabled;
/*
* true when this MMU needs to be unmapped before being used for a new
* purpose.
*/
bool pending_unmap;
/*
* 0: Nobody is currently using this, check vttbr for validity
* >0: Somebody is actively using this.

View File

@ -166,7 +166,8 @@ int create_hyp_exec_mappings(phys_addr_t phys_addr, size_t size,
int create_hyp_stack(phys_addr_t phys_addr, unsigned long *haddr);
void __init free_hyp_pgds(void);
void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size);
void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start,
u64 size, bool may_block);
void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end);
void kvm_stage2_wp_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end);

View File

@ -78,6 +78,8 @@ extern void kvm_s2_mmu_iterate_by_vmid(struct kvm *kvm, u16 vmid,
extern void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu);
extern void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu);
extern void check_nested_vcpu_requests(struct kvm_vcpu *vcpu);
struct kvm_s2_trans {
phys_addr_t output;
unsigned long block_size;
@ -124,7 +126,7 @@ extern int kvm_s2_handle_perm_fault(struct kvm_vcpu *vcpu,
struct kvm_s2_trans *trans);
extern int kvm_inject_s2_fault(struct kvm_vcpu *vcpu, u64 esr_el2);
extern void kvm_nested_s2_wp(struct kvm *kvm);
extern void kvm_nested_s2_unmap(struct kvm *kvm);
extern void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block);
extern void kvm_nested_s2_flush(struct kvm *kvm);
unsigned long compute_tlb_inval_range(struct kvm_s2_mmu *mmu, u64 val);

View File

@ -10,11 +10,9 @@
#include <asm/insn.h>
#include <asm/probes.h>
#define MAX_UINSN_BYTES AARCH64_INSN_SIZE
#define UPROBE_SWBP_INSN cpu_to_le32(BRK64_OPCODE_UPROBES)
#define UPROBE_SWBP_INSN_SIZE AARCH64_INSN_SIZE
#define UPROBE_XOL_SLOT_BYTES MAX_UINSN_BYTES
#define UPROBE_XOL_SLOT_BYTES AARCH64_INSN_SIZE
typedef __le32 uprobe_opcode_t;
@ -23,8 +21,8 @@ struct arch_uprobe_task {
struct arch_uprobe {
union {
u8 insn[MAX_UINSN_BYTES];
u8 ixol[MAX_UINSN_BYTES];
__le32 insn;
__le32 ixol;
};
struct arch_probe_insn api;
bool simulate;

View File

@ -146,6 +146,7 @@ int main(void)
DEFINE(NVHE_INIT_HCR_EL2, offsetof(struct kvm_nvhe_init_params, hcr_el2));
DEFINE(NVHE_INIT_VTTBR, offsetof(struct kvm_nvhe_init_params, vttbr));
DEFINE(NVHE_INIT_VTCR, offsetof(struct kvm_nvhe_init_params, vtcr));
DEFINE(NVHE_INIT_TMP, offsetof(struct kvm_nvhe_init_params, tmp));
#endif
#ifdef CONFIG_CPU_PM
DEFINE(CPU_CTX_SP, offsetof(struct cpu_suspend_ctx, sp));

View File

@ -99,10 +99,6 @@ arm_probe_decode_insn(probe_opcode_t insn, struct arch_probe_insn *api)
aarch64_insn_is_blr(insn) ||
aarch64_insn_is_ret(insn)) {
api->handler = simulate_br_blr_ret;
} else if (aarch64_insn_is_ldr_lit(insn)) {
api->handler = simulate_ldr_literal;
} else if (aarch64_insn_is_ldrsw_lit(insn)) {
api->handler = simulate_ldrsw_literal;
} else {
/*
* Instruction cannot be stepped out-of-line and we don't
@ -140,6 +136,17 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
probe_opcode_t insn = le32_to_cpu(*addr);
probe_opcode_t *scan_end = NULL;
unsigned long size = 0, offset = 0;
struct arch_probe_insn *api = &asi->api;
if (aarch64_insn_is_ldr_lit(insn)) {
api->handler = simulate_ldr_literal;
decoded = INSN_GOOD_NO_SLOT;
} else if (aarch64_insn_is_ldrsw_lit(insn)) {
api->handler = simulate_ldrsw_literal;
decoded = INSN_GOOD_NO_SLOT;
} else {
decoded = arm_probe_decode_insn(insn, &asi->api);
}
/*
* If there's a symbol defined in front of and near enough to
@ -157,7 +164,6 @@ arm_kprobe_decode_insn(kprobe_opcode_t *addr, struct arch_specific_insn *asi)
else
scan_end = addr - MAX_ATOMIC_CONTEXT_SIZE;
}
decoded = arm_probe_decode_insn(insn, &asi->api);
if (decoded != INSN_REJECTED && scan_end)
if (is_probed_address_atomic(addr - 1, scan_end))

View File

@ -171,17 +171,15 @@ simulate_tbz_tbnz(u32 opcode, long addr, struct pt_regs *regs)
void __kprobes
simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs)
{
u64 *load_addr;
unsigned long load_addr;
int xn = opcode & 0x1f;
int disp;
disp = ldr_displacement(opcode);
load_addr = (u64 *) (addr + disp);
load_addr = addr + ldr_displacement(opcode);
if (opcode & (1 << 30)) /* x0-x30 */
set_x_reg(regs, xn, *load_addr);
set_x_reg(regs, xn, READ_ONCE(*(u64 *)load_addr));
else /* w0-w30 */
set_w_reg(regs, xn, *load_addr);
set_w_reg(regs, xn, READ_ONCE(*(u32 *)load_addr));
instruction_pointer_set(regs, instruction_pointer(regs) + 4);
}
@ -189,14 +187,12 @@ simulate_ldr_literal(u32 opcode, long addr, struct pt_regs *regs)
void __kprobes
simulate_ldrsw_literal(u32 opcode, long addr, struct pt_regs *regs)
{
s32 *load_addr;
unsigned long load_addr;
int xn = opcode & 0x1f;
int disp;
disp = ldr_displacement(opcode);
load_addr = (s32 *) (addr + disp);
load_addr = addr + ldr_displacement(opcode);
set_x_reg(regs, xn, *load_addr);
set_x_reg(regs, xn, READ_ONCE(*(s32 *)load_addr));
instruction_pointer_set(regs, instruction_pointer(regs) + 4);
}

View File

@ -42,7 +42,7 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
else if (!IS_ALIGNED(addr, AARCH64_INSN_SIZE))
return -EINVAL;
insn = *(probe_opcode_t *)(&auprobe->insn[0]);
insn = le32_to_cpu(auprobe->insn);
switch (arm_probe_decode_insn(insn, &auprobe->api)) {
case INSN_REJECTED:
@ -108,7 +108,7 @@ bool arch_uprobe_skip_sstep(struct arch_uprobe *auprobe, struct pt_regs *regs)
if (!auprobe->simulate)
return false;
insn = *(probe_opcode_t *)(&auprobe->insn[0]);
insn = le32_to_cpu(auprobe->insn);
addr = instruction_pointer(regs);
if (auprobe->api.handler)

View File

@ -412,6 +412,9 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
p->thread.cpu_context.x19 = (unsigned long)args->fn;
p->thread.cpu_context.x20 = (unsigned long)args->fn_arg;
if (system_supports_poe())
p->thread.por_el0 = POR_EL0_INIT;
}
p->thread.cpu_context.pc = (unsigned long)ret_from_fork;
p->thread.cpu_context.sp = (unsigned long)childregs;

View File

@ -19,6 +19,7 @@
#include <linux/ratelimit.h>
#include <linux/rseq.h>
#include <linux/syscalls.h>
#include <linux/pkeys.h>
#include <asm/daifflags.h>
#include <asm/debug-monitors.h>
@ -66,10 +67,63 @@ struct rt_sigframe_user_layout {
unsigned long end_offset;
};
/*
* Holds any EL0-controlled state that influences unprivileged memory accesses.
* This includes both accesses done in userspace and uaccess done in the kernel.
*
* This state needs to be carefully managed to ensure that it doesn't cause
* uaccess to fail when setting up the signal frame, and the signal handler
* itself also expects a well-defined state when entered.
*/
struct user_access_state {
u64 por_el0;
};
#define BASE_SIGFRAME_SIZE round_up(sizeof(struct rt_sigframe), 16)
#define TERMINATOR_SIZE round_up(sizeof(struct _aarch64_ctx), 16)
#define EXTRA_CONTEXT_SIZE round_up(sizeof(struct extra_context), 16)
/*
* Save the user access state into ua_state and reset it to disable any
* restrictions.
*/
static void save_reset_user_access_state(struct user_access_state *ua_state)
{
if (system_supports_poe()) {
u64 por_enable_all = 0;
for (int pkey = 0; pkey < arch_max_pkey(); pkey++)
por_enable_all |= POE_RXW << (pkey * POR_BITS_PER_PKEY);
ua_state->por_el0 = read_sysreg_s(SYS_POR_EL0);
write_sysreg_s(por_enable_all, SYS_POR_EL0);
/* Ensure that any subsequent uaccess observes the updated value */
isb();
}
}
/*
* Set the user access state for invoking the signal handler.
*
* No uaccess should be done after that function is called.
*/
static void set_handler_user_access_state(void)
{
if (system_supports_poe())
write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
}
/*
* Restore the user access state to the values saved in ua_state.
*
* No uaccess should be done after that function is called.
*/
static void restore_user_access_state(const struct user_access_state *ua_state)
{
if (system_supports_poe())
write_sysreg_s(ua_state->por_el0, SYS_POR_EL0);
}
static void init_user_layout(struct rt_sigframe_user_layout *user)
{
const size_t reserved_size =
@ -261,18 +315,20 @@ static int restore_fpmr_context(struct user_ctxs *user)
return err;
}
static int preserve_poe_context(struct poe_context __user *ctx)
static int preserve_poe_context(struct poe_context __user *ctx,
const struct user_access_state *ua_state)
{
int err = 0;
__put_user_error(POE_MAGIC, &ctx->head.magic, err);
__put_user_error(sizeof(*ctx), &ctx->head.size, err);
__put_user_error(read_sysreg_s(SYS_POR_EL0), &ctx->por_el0, err);
__put_user_error(ua_state->por_el0, &ctx->por_el0, err);
return err;
}
static int restore_poe_context(struct user_ctxs *user)
static int restore_poe_context(struct user_ctxs *user,
struct user_access_state *ua_state)
{
u64 por_el0;
int err = 0;
@ -282,7 +338,7 @@ static int restore_poe_context(struct user_ctxs *user)
__get_user_error(por_el0, &(user->poe->por_el0), err);
if (!err)
write_sysreg_s(por_el0, SYS_POR_EL0);
ua_state->por_el0 = por_el0;
return err;
}
@ -850,7 +906,8 @@ static int parse_user_sigframe(struct user_ctxs *user,
}
static int restore_sigframe(struct pt_regs *regs,
struct rt_sigframe __user *sf)
struct rt_sigframe __user *sf,
struct user_access_state *ua_state)
{
sigset_t set;
int i, err;
@ -899,7 +956,7 @@ static int restore_sigframe(struct pt_regs *regs,
err = restore_zt_context(&user);
if (err == 0 && system_supports_poe() && user.poe)
err = restore_poe_context(&user);
err = restore_poe_context(&user, ua_state);
return err;
}
@ -908,6 +965,7 @@ SYSCALL_DEFINE0(rt_sigreturn)
{
struct pt_regs *regs = current_pt_regs();
struct rt_sigframe __user *frame;
struct user_access_state ua_state;
/* Always make any pending restarted system calls return -EINTR */
current->restart_block.fn = do_no_restart_syscall;
@ -924,12 +982,14 @@ SYSCALL_DEFINE0(rt_sigreturn)
if (!access_ok(frame, sizeof (*frame)))
goto badframe;
if (restore_sigframe(regs, frame))
if (restore_sigframe(regs, frame, &ua_state))
goto badframe;
if (restore_altstack(&frame->uc.uc_stack))
goto badframe;
restore_user_access_state(&ua_state);
return regs->regs[0];
badframe:
@ -1035,7 +1095,8 @@ static int setup_sigframe_layout(struct rt_sigframe_user_layout *user,
}
static int setup_sigframe(struct rt_sigframe_user_layout *user,
struct pt_regs *regs, sigset_t *set)
struct pt_regs *regs, sigset_t *set,
const struct user_access_state *ua_state)
{
int i, err = 0;
struct rt_sigframe __user *sf = user->sigframe;
@ -1097,10 +1158,9 @@ static int setup_sigframe(struct rt_sigframe_user_layout *user,
struct poe_context __user *poe_ctx =
apply_user_offset(user, user->poe_offset);
err |= preserve_poe_context(poe_ctx);
err |= preserve_poe_context(poe_ctx, ua_state);
}
/* ZA state if present */
if (system_supports_sme() && err == 0 && user->za_offset) {
struct za_context __user *za_ctx =
@ -1237,9 +1297,6 @@ static void setup_return(struct pt_regs *regs, struct k_sigaction *ka,
sme_smstop();
}
if (system_supports_poe())
write_sysreg_s(POR_EL0_INIT, SYS_POR_EL0);
if (ka->sa.sa_flags & SA_RESTORER)
sigtramp = ka->sa.sa_restorer;
else
@ -1253,6 +1310,7 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
{
struct rt_sigframe_user_layout user;
struct rt_sigframe __user *frame;
struct user_access_state ua_state;
int err = 0;
fpsimd_signal_preserve_current_state();
@ -1260,13 +1318,14 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
if (get_sigframe(&user, ksig, regs))
return 1;
save_reset_user_access_state(&ua_state);
frame = user.sigframe;
__put_user_error(0, &frame->uc.uc_flags, err);
__put_user_error(NULL, &frame->uc.uc_link, err);
err |= __save_altstack(&frame->uc.uc_stack, regs->sp);
err |= setup_sigframe(&user, regs, set);
err |= setup_sigframe(&user, regs, set, &ua_state);
if (err == 0) {
setup_return(regs, &ksig->ka, &user, usig);
if (ksig->ka.sa.sa_flags & SA_SIGINFO) {
@ -1276,6 +1335,11 @@ static int setup_rt_frame(int usig, struct ksignal *ksig, sigset_t *set,
}
}
if (err == 0)
set_handler_user_access_state();
else
restore_user_access_state(&ua_state);
return err;
}

View File

@ -997,6 +997,9 @@ static int kvm_vcpu_suspend(struct kvm_vcpu *vcpu)
static int check_vcpu_requests(struct kvm_vcpu *vcpu)
{
if (kvm_request_pending(vcpu)) {
if (kvm_check_request(KVM_REQ_VM_DEAD, vcpu))
return -EIO;
if (kvm_check_request(KVM_REQ_SLEEP, vcpu))
kvm_vcpu_sleep(vcpu);
@ -1031,6 +1034,8 @@ static int check_vcpu_requests(struct kvm_vcpu *vcpu)
if (kvm_dirty_ring_check_request(vcpu))
return 0;
check_nested_vcpu_requests(vcpu);
}
return 1;

View File

@ -24,28 +24,25 @@
.align 11
SYM_CODE_START(__kvm_hyp_init)
ventry __invalid // Synchronous EL2t
ventry __invalid // IRQ EL2t
ventry __invalid // FIQ EL2t
ventry __invalid // Error EL2t
ventry . // Synchronous EL2t
ventry . // IRQ EL2t
ventry . // FIQ EL2t
ventry . // Error EL2t
ventry __invalid // Synchronous EL2h
ventry __invalid // IRQ EL2h
ventry __invalid // FIQ EL2h
ventry __invalid // Error EL2h
ventry . // Synchronous EL2h
ventry . // IRQ EL2h
ventry . // FIQ EL2h
ventry . // Error EL2h
ventry __do_hyp_init // Synchronous 64-bit EL1
ventry __invalid // IRQ 64-bit EL1
ventry __invalid // FIQ 64-bit EL1
ventry __invalid // Error 64-bit EL1
ventry . // IRQ 64-bit EL1
ventry . // FIQ 64-bit EL1
ventry . // Error 64-bit EL1
ventry __invalid // Synchronous 32-bit EL1
ventry __invalid // IRQ 32-bit EL1
ventry __invalid // FIQ 32-bit EL1
ventry __invalid // Error 32-bit EL1
__invalid:
b .
ventry . // Synchronous 32-bit EL1
ventry . // IRQ 32-bit EL1
ventry . // FIQ 32-bit EL1
ventry . // Error 32-bit EL1
/*
* Only uses x0..x3 so as to not clobber callee-saved SMCCC registers.
@ -76,6 +73,13 @@ __do_hyp_init:
eret
SYM_CODE_END(__kvm_hyp_init)
SYM_CODE_START_LOCAL(__kvm_init_el2_state)
/* Initialize EL2 CPU state to sane values. */
init_el2_state // Clobbers x0..x2
finalise_el2_state
ret
SYM_CODE_END(__kvm_init_el2_state)
/*
* Initialize the hypervisor in EL2.
*
@ -102,9 +106,12 @@ SYM_CODE_START_LOCAL(___kvm_hyp_init)
// TPIDR_EL2 is used to preserve x0 across the macro maze...
isb
msr tpidr_el2, x0
init_el2_state
finalise_el2_state
str lr, [x0, #NVHE_INIT_TMP]
bl __kvm_init_el2_state
mrs x0, tpidr_el2
ldr lr, [x0, #NVHE_INIT_TMP]
1:
ldr x1, [x0, #NVHE_INIT_TPIDR_EL2]
@ -199,9 +206,8 @@ SYM_CODE_START_LOCAL(__kvm_hyp_init_cpu)
2: msr SPsel, #1 // We want to use SP_EL{1,2}
/* Initialize EL2 CPU state to sane values. */
init_el2_state // Clobbers x0..x2
finalise_el2_state
bl __kvm_init_el2_state
__init_el2_nvhe_prepare_eret
/* Enable MMU, set vectors and stack. */

View File

@ -317,7 +317,7 @@ int kvm_smccc_call_handler(struct kvm_vcpu *vcpu)
* to the guest, and hide SSBS so that the
* guest stays protected.
*/
if (cpus_have_final_cap(ARM64_SSBS))
if (kvm_has_feat(vcpu->kvm, ID_AA64PFR1_EL1, SSBS, IMP))
break;
fallthrough;
case SPECTRE_UNAFFECTED:
@ -428,7 +428,7 @@ int kvm_arm_copy_fw_reg_indices(struct kvm_vcpu *vcpu, u64 __user *uindices)
* Convert the workaround level into an easy-to-compare number, where higher
* values mean better protection.
*/
static int get_kernel_wa_level(u64 regid)
static int get_kernel_wa_level(struct kvm_vcpu *vcpu, u64 regid)
{
switch (regid) {
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
@ -449,7 +449,7 @@ static int get_kernel_wa_level(u64 regid)
* don't have any FW mitigation if SSBS is there at
* all times.
*/
if (cpus_have_final_cap(ARM64_SSBS))
if (kvm_has_feat(vcpu->kvm, ID_AA64PFR1_EL1, SSBS, IMP))
return KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2_NOT_AVAIL;
fallthrough;
case SPECTRE_UNAFFECTED:
@ -486,7 +486,7 @@ int kvm_arm_get_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_1:
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_2:
case KVM_REG_ARM_SMCCC_ARCH_WORKAROUND_3:
val = get_kernel_wa_level(reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
val = get_kernel_wa_level(vcpu, reg->id) & KVM_REG_FEATURE_LEVEL_MASK;
break;
case KVM_REG_ARM_STD_BMAP:
val = READ_ONCE(smccc_feat->std_bmap);
@ -588,7 +588,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
if (val & ~KVM_REG_FEATURE_LEVEL_MASK)
return -EINVAL;
if (get_kernel_wa_level(reg->id) < val)
if (get_kernel_wa_level(vcpu, reg->id) < val)
return -EINVAL;
return 0;
@ -624,7 +624,7 @@ int kvm_arm_set_fw_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *reg)
* We can deal with NOT_AVAIL on NOT_REQUIRED, but not the
* other way around.
*/
if (get_kernel_wa_level(reg->id) < wa_level)
if (get_kernel_wa_level(vcpu, reg->id) < wa_level)
return -EINVAL;
return 0;

View File

@ -328,9 +328,10 @@ static void __unmap_stage2_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64
may_block));
}
void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start, u64 size)
void kvm_stage2_unmap_range(struct kvm_s2_mmu *mmu, phys_addr_t start,
u64 size, bool may_block)
{
__unmap_stage2_range(mmu, start, size, true);
__unmap_stage2_range(mmu, start, size, may_block);
}
void kvm_stage2_flush_range(struct kvm_s2_mmu *mmu, phys_addr_t addr, phys_addr_t end)
@ -1015,7 +1016,7 @@ static void stage2_unmap_memslot(struct kvm *kvm,
if (!(vma->vm_flags & VM_PFNMAP)) {
gpa_t gpa = addr + (vm_start - memslot->userspace_addr);
kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, vm_end - vm_start);
kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, vm_end - vm_start, true);
}
hva = vm_end;
} while (hva < reg_end);
@ -1042,7 +1043,7 @@ void stage2_unmap_vm(struct kvm *kvm)
kvm_for_each_memslot(memslot, bkt, slots)
stage2_unmap_memslot(kvm, memslot);
kvm_nested_s2_unmap(kvm);
kvm_nested_s2_unmap(kvm, true);
write_unlock(&kvm->mmu_lock);
mmap_read_unlock(current->mm);
@ -1912,7 +1913,7 @@ bool kvm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
(range->end - range->start) << PAGE_SHIFT,
range->may_block);
kvm_nested_s2_unmap(kvm);
kvm_nested_s2_unmap(kvm, range->may_block);
return false;
}
@ -2179,8 +2180,8 @@ void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
phys_addr_t size = slot->npages << PAGE_SHIFT;
write_lock(&kvm->mmu_lock);
kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size);
kvm_nested_s2_unmap(kvm);
kvm_stage2_unmap_range(&kvm->arch.mmu, gpa, size, true);
kvm_nested_s2_unmap(kvm, true);
write_unlock(&kvm->mmu_lock);
}

View File

@ -632,9 +632,9 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)
/* Set the scene for the next search */
kvm->arch.nested_mmus_next = (i + 1) % kvm->arch.nested_mmus_size;
/* Clear the old state */
/* Make sure we don't forget to do the laundry */
if (kvm_s2_mmu_valid(s2_mmu))
kvm_stage2_unmap_range(s2_mmu, 0, kvm_phys_size(s2_mmu));
s2_mmu->pending_unmap = true;
/*
* The virtual VMID (modulo CnP) will be used as a key when matching
@ -650,6 +650,16 @@ static struct kvm_s2_mmu *get_s2_mmu_nested(struct kvm_vcpu *vcpu)
out:
atomic_inc(&s2_mmu->refcnt);
/*
* Set the vCPU request to perform an unmap, even if the pending unmap
* originates from another vCPU. This guarantees that the MMU has been
* completely unmapped before any vCPU actually uses it, and allows
* multiple vCPUs to lend a hand with completing the unmap.
*/
if (s2_mmu->pending_unmap)
kvm_make_request(KVM_REQ_NESTED_S2_UNMAP, vcpu);
return s2_mmu;
}
@ -663,6 +673,13 @@ void kvm_init_nested_s2_mmu(struct kvm_s2_mmu *mmu)
void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu)
{
/*
* The vCPU kept its reference on the MMU after the last put, keep
* rolling with it.
*/
if (vcpu->arch.hw_mmu)
return;
if (is_hyp_ctxt(vcpu)) {
vcpu->arch.hw_mmu = &vcpu->kvm->arch.mmu;
} else {
@ -674,10 +691,18 @@ void kvm_vcpu_load_hw_mmu(struct kvm_vcpu *vcpu)
void kvm_vcpu_put_hw_mmu(struct kvm_vcpu *vcpu)
{
if (kvm_is_nested_s2_mmu(vcpu->kvm, vcpu->arch.hw_mmu)) {
/*
* Keep a reference on the associated stage-2 MMU if the vCPU is
* scheduling out and not in WFI emulation, suggesting it is likely to
* reuse the MMU sometime soon.
*/
if (vcpu->scheduled_out && !vcpu_get_flag(vcpu, IN_WFI))
return;
if (kvm_is_nested_s2_mmu(vcpu->kvm, vcpu->arch.hw_mmu))
atomic_dec(&vcpu->arch.hw_mmu->refcnt);
vcpu->arch.hw_mmu = NULL;
}
vcpu->arch.hw_mmu = NULL;
}
/*
@ -730,7 +755,7 @@ void kvm_nested_s2_wp(struct kvm *kvm)
}
}
void kvm_nested_s2_unmap(struct kvm *kvm)
void kvm_nested_s2_unmap(struct kvm *kvm, bool may_block)
{
int i;
@ -740,7 +765,7 @@ void kvm_nested_s2_unmap(struct kvm *kvm)
struct kvm_s2_mmu *mmu = &kvm->arch.nested_mmus[i];
if (kvm_s2_mmu_valid(mmu))
kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu));
kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), may_block);
}
}
@ -1184,3 +1209,17 @@ int kvm_init_nv_sysregs(struct kvm *kvm)
return 0;
}
void check_nested_vcpu_requests(struct kvm_vcpu *vcpu)
{
if (kvm_check_request(KVM_REQ_NESTED_S2_UNMAP, vcpu)) {
struct kvm_s2_mmu *mmu = vcpu->arch.hw_mmu;
write_lock(&vcpu->kvm->mmu_lock);
if (mmu->pending_unmap) {
kvm_stage2_unmap_range(mmu, 0, kvm_phys_size(mmu), true);
mmu->pending_unmap = false;
}
write_unlock(&vcpu->kvm->mmu_lock);
}
}

View File

@ -1527,6 +1527,14 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_SME);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_RNDR_trap);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_NMI);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTE_frac);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_GCS);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_THE);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_MTEX);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_DF2);
val &= ~ARM64_FEATURE_MASK(ID_AA64PFR1_EL1_PFAR);
break;
case SYS_ID_AA64PFR2_EL1:
/* We only expose FPMR */
@ -1550,7 +1558,8 @@ static u64 __kvm_read_sanitised_id_reg(const struct kvm_vcpu *vcpu,
val &= ~ID_AA64MMFR2_EL1_CCIDX_MASK;
break;
case SYS_ID_AA64MMFR3_EL1:
val &= ID_AA64MMFR3_EL1_TCRX | ID_AA64MMFR3_EL1_S1POE;
val &= ID_AA64MMFR3_EL1_TCRX | ID_AA64MMFR3_EL1_S1POE |
ID_AA64MMFR3_EL1_S1PIE;
break;
case SYS_ID_MMFR4_EL1:
val &= ~ARM64_FEATURE_MASK(ID_MMFR4_EL1_CCIDX);
@ -1985,7 +1994,7 @@ static u64 reset_clidr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r)
* one cache line.
*/
if (kvm_has_mte(vcpu->kvm))
clidr |= 2 << CLIDR_TTYPE_SHIFT(loc);
clidr |= 2ULL << CLIDR_TTYPE_SHIFT(loc);
__vcpu_sys_reg(vcpu, r->reg) = clidr;
@ -2376,7 +2385,19 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_AA64PFR0_EL1_RAS |
ID_AA64PFR0_EL1_AdvSIMD |
ID_AA64PFR0_EL1_FP), },
ID_SANITISED(ID_AA64PFR1_EL1),
ID_WRITABLE(ID_AA64PFR1_EL1, ~(ID_AA64PFR1_EL1_PFAR |
ID_AA64PFR1_EL1_DF2 |
ID_AA64PFR1_EL1_MTEX |
ID_AA64PFR1_EL1_THE |
ID_AA64PFR1_EL1_GCS |
ID_AA64PFR1_EL1_MTE_frac |
ID_AA64PFR1_EL1_NMI |
ID_AA64PFR1_EL1_RNDR_trap |
ID_AA64PFR1_EL1_SME |
ID_AA64PFR1_EL1_RES0 |
ID_AA64PFR1_EL1_MPAM_frac |
ID_AA64PFR1_EL1_RAS_frac |
ID_AA64PFR1_EL1_MTE)),
ID_WRITABLE(ID_AA64PFR2_EL1, ID_AA64PFR2_EL1_FPMR),
ID_UNALLOCATED(4,3),
ID_WRITABLE(ID_AA64ZFR0_EL1, ~ID_AA64ZFR0_EL1_RES0),
@ -2390,7 +2411,21 @@ static const struct sys_reg_desc sys_reg_descs[] = {
.get_user = get_id_reg,
.set_user = set_id_aa64dfr0_el1,
.reset = read_sanitised_id_aa64dfr0_el1,
.val = ID_AA64DFR0_EL1_PMUVer_MASK |
/*
* Prior to FEAT_Debugv8.9, the architecture defines context-aware
* breakpoints (CTX_CMPs) as the highest numbered breakpoints (BRPs).
* KVM does not trap + emulate the breakpoint registers, and as such
* cannot support a layout that misaligns with the underlying hardware.
* While it may be possible to describe a subset that aligns with
* hardware, just prevent changes to BRPs and CTX_CMPs altogether for
* simplicity.
*
* See DDI0487K.a, section D2.8.3 Breakpoint types and linking
* of breakpoints for more details.
*/
.val = ID_AA64DFR0_EL1_DoubleLock_MASK |
ID_AA64DFR0_EL1_WRPs_MASK |
ID_AA64DFR0_EL1_PMUVer_MASK |
ID_AA64DFR0_EL1_DebugVer_MASK, },
ID_SANITISED(ID_AA64DFR1_EL1),
ID_UNALLOCATED(5,2),
@ -2433,6 +2468,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_AA64MMFR2_EL1_NV |
ID_AA64MMFR2_EL1_CCIDX)),
ID_WRITABLE(ID_AA64MMFR3_EL1, (ID_AA64MMFR3_EL1_TCRX |
ID_AA64MMFR3_EL1_S1PIE |
ID_AA64MMFR3_EL1_S1POE)),
ID_SANITISED(ID_AA64MMFR4_EL1),
ID_UNALLOCATED(7,5),
@ -2903,7 +2939,7 @@ static bool handle_alle1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
* Drop all shadow S2s, resulting in S1/S2 TLBIs for each of the
* corresponding VMIDs.
*/
kvm_nested_s2_unmap(vcpu->kvm);
kvm_nested_s2_unmap(vcpu->kvm, true);
write_unlock(&vcpu->kvm->mmu_lock);
@ -2955,7 +2991,30 @@ union tlbi_info {
static void s2_mmu_unmap_range(struct kvm_s2_mmu *mmu,
const union tlbi_info *info)
{
kvm_stage2_unmap_range(mmu, info->range.start, info->range.size);
/*
* The unmap operation is allowed to drop the MMU lock and block, which
* means that @mmu could be used for a different context than the one
* currently being invalidated.
*
* This behavior is still safe, as:
*
* 1) The vCPU(s) that recycled the MMU are responsible for invalidating
* the entire MMU before reusing it, which still honors the intent
* of a TLBI.
*
* 2) Until the guest TLBI instruction is 'retired' (i.e. increment PC
* and ERET to the guest), other vCPUs are allowed to use stale
* translations.
*
* 3) Accidentally unmapping an unrelated MMU context is nonfatal, and
* at worst may cause more aborts for shadow stage-2 fills.
*
* Dropping the MMU lock also implies that shadow stage-2 fills could
* happen behind the back of the TLBI. This is still safe, though, as
* the L1 needs to put its stage-2 in a consistent state before doing
* the TLBI.
*/
kvm_stage2_unmap_range(mmu, info->range.start, info->range.size, true);
}
static bool handle_vmalls12e1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p,
@ -3050,7 +3109,11 @@ static void s2_mmu_unmap_ipa(struct kvm_s2_mmu *mmu,
max_size = compute_tlb_inval_range(mmu, info->ipa.addr);
base_addr &= ~(max_size - 1);
kvm_stage2_unmap_range(mmu, base_addr, max_size);
/*
* See comment in s2_mmu_unmap_range() for why this is allowed to
* reschedule.
*/
kvm_stage2_unmap_range(mmu, base_addr, max_size, true);
}
static bool handle_ipas2e1is(struct kvm_vcpu *vcpu, struct sys_reg_params *p,

View File

@ -417,8 +417,28 @@ static void __kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
kfree(vgic_cpu->private_irqs);
vgic_cpu->private_irqs = NULL;
if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3)
if (vcpu->kvm->arch.vgic.vgic_model == KVM_DEV_TYPE_ARM_VGIC_V3) {
/*
* If this vCPU is being destroyed because of a failed creation
* then unregister the redistributor to avoid leaving behind a
* dangling pointer to the vCPU struct.
*
* vCPUs that have been successfully created (i.e. added to
* kvm->vcpu_array) get unregistered in kvm_vgic_destroy(), as
* this function gets called while holding kvm->arch.config_lock
* in the VM teardown path and would otherwise introduce a lock
* inversion w.r.t. kvm->srcu.
*
* vCPUs that failed creation are torn down outside of the
* kvm->arch.config_lock and do not get unregistered in
* kvm_vgic_destroy(), meaning it is both safe and necessary to
* do so here.
*/
if (kvm_get_vcpu_by_id(vcpu->kvm, vcpu->vcpu_id) != vcpu)
vgic_unregister_redist_iodev(vcpu);
vgic_cpu->rd_iodev.base_addr = VGIC_ADDR_UNDEF;
}
}
void kvm_vgic_vcpu_destroy(struct kvm_vcpu *vcpu)
@ -524,22 +544,31 @@ int kvm_vgic_map_resources(struct kvm *kvm)
if (ret)
goto out;
dist->ready = true;
dist_base = dist->vgic_dist_base;
mutex_unlock(&kvm->arch.config_lock);
ret = vgic_register_dist_iodev(kvm, dist_base, type);
if (ret)
if (ret) {
kvm_err("Unable to register VGIC dist MMIO regions\n");
goto out_slots;
}
/*
* kvm_io_bus_register_dev() guarantees all readers see the new MMIO
* registration before returning through synchronize_srcu(), which also
* implies a full memory barrier. As such, marking the distributor as
* 'ready' here is guaranteed to be ordered after all vCPUs having seen
* a completely configured distributor.
*/
dist->ready = true;
goto out_slots;
out:
mutex_unlock(&kvm->arch.config_lock);
out_slots:
mutex_unlock(&kvm->slots_lock);
if (ret)
kvm_vgic_destroy(kvm);
kvm_vm_dead(kvm);
mutex_unlock(&kvm->slots_lock);
return ret;
}

View File

@ -236,7 +236,12 @@ static int vgic_set_common_attr(struct kvm_device *dev,
mutex_lock(&dev->kvm->arch.config_lock);
if (vgic_ready(dev->kvm) || dev->kvm->arch.vgic.nr_spis)
/*
* Either userspace has already configured NR_IRQS or
* the vgic has already been initialized and vgic_init()
* supplied a default amount of SPIs.
*/
if (dev->kvm->arch.vgic.nr_spis)
ret = -EBUSY;
else
dev->kvm->arch.vgic.nr_spis =

View File

@ -2220,7 +2220,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
emit(A64_STR64I(A64_R(20), A64_SP, regs_off + 8), ctx);
if (flags & BPF_TRAMP_F_CALL_ORIG) {
emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
/* for the first pass, assume the worst case */
if (!ctx->image)
ctx->idx += 4;
else
emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
emit_call((const u64)__bpf_tramp_enter, ctx);
}
@ -2264,7 +2268,11 @@ static int prepare_trampoline(struct jit_ctx *ctx, struct bpf_tramp_image *im,
if (flags & BPF_TRAMP_F_CALL_ORIG) {
im->ip_epilogue = ctx->ro_image + ctx->idx;
emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
/* for the first pass, assume the worst case */
if (!ctx->image)
ctx->idx += 4;
else
emit_a64_mov_i64(A64_R(0), (const u64)im, ctx);
emit_call((const u64)__bpf_tramp_exit, ctx);
}

View File

@ -26,6 +26,10 @@ struct loongson_board_info {
#define NR_WORDS DIV_ROUND_UP(NR_CPUS, BITS_PER_LONG)
/*
* The "core" of cores_per_node and cores_per_package stands for a
* logical core, which means in a SMT system it stands for a thread.
*/
struct loongson_system_configuration {
int nr_cpus;
int nr_nodes;

View File

@ -16,7 +16,7 @@
#define XRANGE_SHIFT (48)
/* Valid address length */
#define XRANGE_SHADOW_SHIFT (PGDIR_SHIFT + PAGE_SHIFT - 3)
#define XRANGE_SHADOW_SHIFT min(cpu_vabits, VA_BITS)
/* Used for taking out the valid address */
#define XRANGE_SHADOW_MASK GENMASK_ULL(XRANGE_SHADOW_SHIFT - 1, 0)
/* One segment whole address space size */

View File

@ -250,7 +250,7 @@
#define CSR_ESTAT_IS_WIDTH 15
#define CSR_ESTAT_IS (_ULCAST_(0x7fff) << CSR_ESTAT_IS_SHIFT)
#define LOONGARCH_CSR_ERA 0x6 /* ERA */
#define LOONGARCH_CSR_ERA 0x6 /* Exception return address */
#define LOONGARCH_CSR_BADV 0x7 /* Bad virtual address */

View File

@ -10,6 +10,7 @@
#define __HAVE_ARCH_PMD_ALLOC_ONE
#define __HAVE_ARCH_PUD_ALLOC_ONE
#define __HAVE_ARCH_PTE_ALLOC_ONE_KERNEL
#include <asm-generic/pgalloc.h>
static inline void pmd_populate_kernel(struct mm_struct *mm,
@ -44,6 +45,16 @@ extern void pagetable_init(void);
extern pgd_t *pgd_alloc(struct mm_struct *mm);
static inline pte_t *pte_alloc_one_kernel(struct mm_struct *mm)
{
pte_t *pte = __pte_alloc_one_kernel(mm);
if (pte)
kernel_pte_init(pte);
return pte;
}
#define __pte_free_tlb(tlb, pte, address) \
do { \
pagetable_pte_dtor(page_ptdesc(pte)); \

View File

@ -269,6 +269,7 @@ extern void set_pmd_at(struct mm_struct *mm, unsigned long addr, pmd_t *pmdp, pm
extern void pgd_init(void *addr);
extern void pud_init(void *addr);
extern void pmd_init(void *addr);
extern void kernel_pte_init(void *addr);
/*
* Encode/decode swap entries and swap PTEs. Swap PTEs are all PTEs that
@ -325,39 +326,17 @@ static inline void set_pte(pte_t *ptep, pte_t pteval)
{
WRITE_ONCE(*ptep, pteval);
if (pte_val(pteval) & _PAGE_GLOBAL) {
pte_t *buddy = ptep_buddy(ptep);
/*
* Make sure the buddy is global too (if it's !none,
* it better already be global)
*/
if (pte_none(ptep_get(buddy))) {
#ifdef CONFIG_SMP
/*
* For SMP, multiple CPUs can race, so we need
* to do this atomically.
*/
__asm__ __volatile__(
__AMOR "$zero, %[global], %[buddy] \n"
: [buddy] "+ZB" (buddy->pte)
: [global] "r" (_PAGE_GLOBAL)
: "memory");
DBAR(0b11000); /* o_wrw = 0b11000 */
#else /* !CONFIG_SMP */
WRITE_ONCE(*buddy, __pte(pte_val(ptep_get(buddy)) | _PAGE_GLOBAL));
#endif /* CONFIG_SMP */
}
}
if (pte_val(pteval) & _PAGE_GLOBAL)
DBAR(0b11000); /* o_wrw = 0b11000 */
#endif
}
static inline void pte_clear(struct mm_struct *mm, unsigned long addr, pte_t *ptep)
{
/* Preserve global status for the pair */
if (pte_val(ptep_get(ptep_buddy(ptep))) & _PAGE_GLOBAL)
set_pte(ptep, __pte(_PAGE_GLOBAL));
else
set_pte(ptep, __pte(0));
pte_t pte = ptep_get(ptep);
pte_val(pte) &= _PAGE_GLOBAL;
set_pte(ptep, pte);
}
#define PGD_T_LOG2 (__builtin_ffs(sizeof(pgd_t)) - 1)

View File

@ -293,13 +293,15 @@ unsigned long stack_top(void)
{
unsigned long top = TASK_SIZE & PAGE_MASK;
/* Space for the VDSO & data page */
top -= PAGE_ALIGN(current->thread.vdso->size);
top -= VVAR_SIZE;
if (current->thread.vdso) {
/* Space for the VDSO & data page */
top -= PAGE_ALIGN(current->thread.vdso->size);
top -= VVAR_SIZE;
/* Space to randomize the VDSO base */
if (current->flags & PF_RANDOMIZE)
top -= VDSO_RANDOMIZE_SIZE;
/* Space to randomize the VDSO base */
if (current->flags & PF_RANDOMIZE)
top -= VDSO_RANDOMIZE_SIZE;
}
return top;
}

View File

@ -55,6 +55,7 @@
#define SMBIOS_FREQHIGH_OFFSET 0x17
#define SMBIOS_FREQLOW_MASK 0xFF
#define SMBIOS_CORE_PACKAGE_OFFSET 0x23
#define SMBIOS_THREAD_PACKAGE_OFFSET 0x25
#define LOONGSON_EFI_ENABLE (1 << 3)
unsigned long fw_arg0, fw_arg1, fw_arg2;
@ -125,7 +126,7 @@ static void __init parse_cpu_table(const struct dmi_header *dm)
cpu_clock_freq = freq_temp * 1000000;
loongson_sysconf.cpuname = (void *)dmi_string_parse(dm, dmi_data[16]);
loongson_sysconf.cores_per_package = *(dmi_data + SMBIOS_CORE_PACKAGE_OFFSET);
loongson_sysconf.cores_per_package = *(dmi_data + SMBIOS_THREAD_PACKAGE_OFFSET);
pr_info("CpuClock = %llu\n", cpu_clock_freq);
}

View File

@ -555,6 +555,9 @@ asmlinkage void noinstr do_ale(struct pt_regs *regs)
#else
unsigned int *pc;
if (regs->csr_prmd & CSR_PRMD_PIE)
local_irq_enable();
perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, regs->csr_badvaddr);
/*
@ -579,6 +582,8 @@ asmlinkage void noinstr do_ale(struct pt_regs *regs)
die_if_kernel("Kernel ale access", regs);
force_sig_fault(SIGBUS, BUS_ADRALN, (void __user *)regs->csr_badvaddr);
out:
if (regs->csr_prmd & CSR_PRMD_PIE)
local_irq_disable();
#endif
irqentry_exit(regs, state);
}

View File

@ -34,7 +34,6 @@ static union {
struct loongarch_vdso_data vdata;
} loongarch_vdso_data __page_aligned_data;
static struct page *vdso_pages[] = { NULL };
struct vdso_data *vdso_data = generic_vdso_data.data;
struct vdso_pcpu_data *vdso_pdata = loongarch_vdso_data.vdata.pdata;
struct vdso_rng_data *vdso_rng_data = &loongarch_vdso_data.vdata.rng_data;
@ -85,10 +84,8 @@ static vm_fault_t vvar_fault(const struct vm_special_mapping *sm,
struct loongarch_vdso_info vdso_info = {
.vdso = vdso_start,
.size = PAGE_SIZE,
.code_mapping = {
.name = "[vdso]",
.pages = vdso_pages,
.mremap = vdso_mremap,
},
.data_mapping = {
@ -103,11 +100,14 @@ static int __init init_vdso(void)
unsigned long i, cpu, pfn;
BUG_ON(!PAGE_ALIGNED(vdso_info.vdso));
BUG_ON(!PAGE_ALIGNED(vdso_info.size));
for_each_possible_cpu(cpu)
vdso_pdata[cpu].node = cpu_to_node(cpu);
vdso_info.size = PAGE_ALIGN(vdso_end - vdso_start);
vdso_info.code_mapping.pages =
kcalloc(vdso_info.size / PAGE_SIZE, sizeof(struct page *), GFP_KERNEL);
pfn = __phys_to_pfn(__pa_symbol(vdso_info.vdso));
for (i = 0; i < vdso_info.size / PAGE_SIZE; i++)
vdso_info.code_mapping.pages[i] = pfn_to_page(pfn + i);

View File

@ -161,10 +161,11 @@ static void _kvm_save_timer(struct kvm_vcpu *vcpu)
if (kvm_vcpu_is_blocking(vcpu)) {
/*
* HRTIMER_MODE_PINNED is suggested since vcpu may run in
* the same physical cpu in next time
* HRTIMER_MODE_PINNED_HARD is suggested since vcpu may run in
* the same physical cpu in next time, and the timer should run
* in hardirq context even in the PREEMPT_RT case.
*/
hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED);
hrtimer_start(&vcpu->arch.swtimer, expire, HRTIMER_MODE_ABS_PINNED_HARD);
}
}

View File

@ -1457,7 +1457,7 @@ int kvm_arch_vcpu_create(struct kvm_vcpu *vcpu)
vcpu->arch.vpid = 0;
vcpu->arch.flush_gpa = INVALID_GPA;
hrtimer_init(&vcpu->arch.swtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED);
hrtimer_init(&vcpu->arch.swtimer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD);
vcpu->arch.swtimer.function = kvm_swtimer_wakeup;
vcpu->arch.handle_exit = kvm_handle_exit;

View File

@ -201,7 +201,9 @@ pte_t * __init populate_kernel_pte(unsigned long addr)
pte = memblock_alloc(PAGE_SIZE, PAGE_SIZE);
if (!pte)
panic("%s: Failed to allocate memory\n", __func__);
pmd_populate_kernel(&init_mm, pmd, pte);
kernel_pte_init(pte);
}
return pte_offset_kernel(pmd, addr);

View File

@ -116,6 +116,26 @@ void pud_init(void *addr)
EXPORT_SYMBOL_GPL(pud_init);
#endif
void kernel_pte_init(void *addr)
{
unsigned long *p, *end;
p = (unsigned long *)addr;
end = p + PTRS_PER_PTE;
do {
p[0] = _PAGE_GLOBAL;
p[1] = _PAGE_GLOBAL;
p[2] = _PAGE_GLOBAL;
p[3] = _PAGE_GLOBAL;
p[4] = _PAGE_GLOBAL;
p += 8;
p[-3] = _PAGE_GLOBAL;
p[-2] = _PAGE_GLOBAL;
p[-1] = _PAGE_GLOBAL;
} while (p != end);
}
pmd_t mk_pmd(struct page *page, pgprot_t prot)
{
pmd_t pmd;

View File

@ -102,3 +102,4 @@ unsigned long __cmpxchg_small(volatile void *ptr, unsigned long old,
return old;
}
}
EXPORT_SYMBOL(__cmpxchg_small);

View File

@ -282,6 +282,7 @@ int __init opal_event_init(void)
name, NULL);
if (rc) {
pr_warn("Error %d requesting OPAL irq %d\n", rc, (int)r->start);
kfree(name);
continue;
}
}

View File

@ -177,7 +177,7 @@ config RISCV
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RETHOOK if !XIP_KERNEL
select HAVE_RSEQ
select HAVE_RUST if RUSTC_SUPPORTS_RISCV
select HAVE_RUST if RUSTC_SUPPORTS_RISCV && CC_IS_CLANG
select HAVE_SAMPLE_FTRACE_DIRECT
select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
select HAVE_STACKPROTECTOR

View File

@ -2,6 +2,12 @@ ifdef CONFIG_RELOCATABLE
KBUILD_CFLAGS += -fno-pie
endif
ifdef CONFIG_RISCV_ALTERNATIVE_EARLY
ifdef CONFIG_FORTIFY_SOURCE
KBUILD_CFLAGS += -D__NO_FORTIFY
endif
endif
obj-$(CONFIG_ERRATA_ANDES) += andes/
obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
obj-$(CONFIG_ERRATA_THEAD) += thead/

View File

@ -36,6 +36,11 @@ KASAN_SANITIZE_alternative.o := n
KASAN_SANITIZE_cpufeature.o := n
KASAN_SANITIZE_sbi_ecall.o := n
endif
ifdef CONFIG_FORTIFY_SOURCE
CFLAGS_alternative.o += -D__NO_FORTIFY
CFLAGS_cpufeature.o += -D__NO_FORTIFY
CFLAGS_sbi_ecall.o += -D__NO_FORTIFY
endif
endif
extra-y += vmlinux.lds

View File

@ -210,7 +210,7 @@ void __init __iomem *__acpi_map_table(unsigned long phys, unsigned long size)
if (!size)
return NULL;
return early_ioremap(phys, size);
return early_memremap(phys, size);
}
void __init __acpi_unmap_table(void __iomem *map, unsigned long size)
@ -218,7 +218,7 @@ void __init __acpi_unmap_table(void __iomem *map, unsigned long size)
if (!map || !size)
return;
early_iounmap(map, size);
early_memunmap(map, size);
}
void __iomem *acpi_os_ioremap(acpi_physical_address phys, acpi_size size)

View File

@ -4,8 +4,6 @@
* Copyright (C) 2017 SiFive
*/
#define GENERATING_ASM_OFFSETS
#include <linux/kbuild.h>
#include <linux/mm.h>
#include <linux/sched.h>

View File

@ -80,8 +80,7 @@ int populate_cache_leaves(unsigned int cpu)
{
struct cpu_cacheinfo *this_cpu_ci = get_cpu_cacheinfo(cpu);
struct cacheinfo *this_leaf = this_cpu_ci->info_list;
struct device_node *np = of_cpu_device_node_get(cpu);
struct device_node *prev = NULL;
struct device_node *np, *prev;
int levels = 1, level = 1;
if (!acpi_disabled) {
@ -105,6 +104,10 @@ int populate_cache_leaves(unsigned int cpu)
return 0;
}
np = of_cpu_device_node_get(cpu);
if (!np)
return -ENOENT;
if (of_property_read_bool(np, "cache-size"))
ci_leaf_init(this_leaf++, CACHE_TYPE_UNIFIED, level);
if (of_property_read_bool(np, "i-cache-size"))

View File

@ -58,7 +58,7 @@ void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
if (cpu_ops->cpu_is_stopped)
ret = cpu_ops->cpu_is_stopped(cpu);
if (ret)
pr_warn("CPU%d may not have stopped: %d\n", cpu, ret);
pr_warn("CPU%u may not have stopped: %d\n", cpu, ret);
}
/*

View File

@ -64,7 +64,7 @@ extra_header_fields:
.long efi_header_end - _start // SizeOfHeaders
.long 0 // CheckSum
.short IMAGE_SUBSYSTEM_EFI_APPLICATION // Subsystem
.short 0 // DllCharacteristics
.short IMAGE_DLL_CHARACTERISTICS_NX_COMPAT // DllCharacteristics
.quad 0 // SizeOfStackReserve
.quad 0 // SizeOfStackCommit
.quad 0 // SizeOfHeapReserve

View File

@ -16,8 +16,12 @@ KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO), $(KBUILD_CFLAGS))
KBUILD_CFLAGS += -mcmodel=medany
CFLAGS_cmdline_early.o += -D__NO_FORTIFY
CFLAGS_lib-fdt_ro.o += -D__NO_FORTIFY
CFLAGS_fdt_early.o += -D__NO_FORTIFY
# lib/string.c already defines __NO_FORTIFY
CFLAGS_ctype.o += -D__NO_FORTIFY
CFLAGS_lib-fdt.o += -D__NO_FORTIFY
CFLAGS_lib-fdt_ro.o += -D__NO_FORTIFY
CFLAGS_archrandom_early.o += -D__NO_FORTIFY
$(obj)/%.pi.o: OBJCOPYFLAGS := --prefix-symbols=__pi_ \
--remove-section=.note.gnu.property \

View File

@ -136,8 +136,6 @@
#define REG_PTR(insn, pos, regs) \
(ulong *)((ulong)(regs) + REG_OFFSET(insn, pos))
#define GET_RM(insn) (((insn) >> 12) & 7)
#define GET_RS1(insn, regs) (*REG_PTR(insn, SH_RS1, regs))
#define GET_RS2(insn, regs) (*REG_PTR(insn, SH_RS2, regs))
#define GET_RS1S(insn, regs) (*REG_PTR(RVC_RS1S(insn), 0, regs))

View File

@ -18,6 +18,7 @@ obj-vdso = $(patsubst %, %.o, $(vdso-syms)) note.o
ccflags-y := -fno-stack-protector
ccflags-y += -DDISABLE_BRANCH_PROFILING
ccflags-y += -fno-builtin
ifneq ($(c-gettimeofday-y),)
CFLAGS_vgettimeofday.o += -fPIC -include $(c-gettimeofday-y)

View File

@ -55,7 +55,7 @@ struct imsic {
/* IMSIC SW-file */
struct imsic_mrif *swfile;
phys_addr_t swfile_pa;
spinlock_t swfile_extirq_lock;
raw_spinlock_t swfile_extirq_lock;
};
#define imsic_vs_csr_read(__c) \
@ -622,7 +622,7 @@ static void imsic_swfile_extirq_update(struct kvm_vcpu *vcpu)
* interruptions between reading topei and updating pending status.
*/
spin_lock_irqsave(&imsic->swfile_extirq_lock, flags);
raw_spin_lock_irqsave(&imsic->swfile_extirq_lock, flags);
if (imsic_mrif_atomic_read(mrif, &mrif->eidelivery) &&
imsic_mrif_topei(mrif, imsic->nr_eix, imsic->nr_msis))
@ -630,7 +630,7 @@ static void imsic_swfile_extirq_update(struct kvm_vcpu *vcpu)
else
kvm_riscv_vcpu_unset_interrupt(vcpu, IRQ_VS_EXT);
spin_unlock_irqrestore(&imsic->swfile_extirq_lock, flags);
raw_spin_unlock_irqrestore(&imsic->swfile_extirq_lock, flags);
}
static void imsic_swfile_read(struct kvm_vcpu *vcpu, bool clear,
@ -1051,7 +1051,7 @@ int kvm_riscv_vcpu_aia_imsic_init(struct kvm_vcpu *vcpu)
}
imsic->swfile = page_to_virt(swfile_page);
imsic->swfile_pa = page_to_phys(swfile_page);
spin_lock_init(&imsic->swfile_extirq_lock);
raw_spin_lock_init(&imsic->swfile_extirq_lock);
/* Setup IO device */
kvm_iodevice_init(&imsic->iodev, &imsic_iodoev_ops);

View File

@ -18,6 +18,7 @@
#define RV_MAX_REG_ARGS 8
#define RV_FENTRY_NINSNS 2
#define RV_FENTRY_NBYTES (RV_FENTRY_NINSNS * 4)
#define RV_KCFI_NINSNS (IS_ENABLED(CONFIG_CFI_CLANG) ? 1 : 0)
/* imm that allows emit_imm to emit max count insns */
#define RV_MAX_COUNT_IMM 0x7FFF7FF7FF7FF7FF
@ -271,7 +272,8 @@ static void __build_epilogue(bool is_tail_call, struct rv_jit_context *ctx)
if (!is_tail_call)
emit_addiw(RV_REG_A0, RV_REG_A5, 0, ctx);
emit_jalr(RV_REG_ZERO, is_tail_call ? RV_REG_T3 : RV_REG_RA,
is_tail_call ? (RV_FENTRY_NINSNS + 1) * 4 : 0, /* skip reserved nops and TCC init */
/* kcfi, fentry and TCC init insns will be skipped on tailcall */
is_tail_call ? (RV_KCFI_NINSNS + RV_FENTRY_NINSNS + 1) * 4 : 0,
ctx);
}
@ -548,8 +550,8 @@ static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64,
rv_lr_w(r0, 0, rd, 0, 0), ctx);
jmp_offset = ninsns_rvoff(8);
emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx);
emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 0) :
rv_sc_w(RV_REG_T3, rs, rd, 0, 0), ctx);
emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) :
rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx);
jmp_offset = ninsns_rvoff(-6);
emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx);
emit(rv_fence(0x3, 0x3), ctx);

View File

@ -50,7 +50,6 @@ CONFIG_NUMA=y
CONFIG_HZ_100=y
CONFIG_CERT_STORE=y
CONFIG_EXPOLINE=y
# CONFIG_EXPOLINE_EXTERN is not set
CONFIG_EXPOLINE_AUTO=y
CONFIG_CHSC_SCH=y
CONFIG_VFIO_CCW=m
@ -95,6 +94,7 @@ CONFIG_BINFMT_MISC=m
CONFIG_ZSWAP=y
CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y
CONFIG_ZSMALLOC_STAT=y
CONFIG_SLAB_BUCKETS=y
CONFIG_SLUB_STATS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_MEMORY_HOTPLUG=y
@ -426,6 +426,13 @@ CONFIG_DEVTMPFS_SAFE=y
# CONFIG_FW_LOADER is not set
CONFIG_CONNECTOR=y
CONFIG_ZRAM=y
CONFIG_ZRAM_BACKEND_LZ4=y
CONFIG_ZRAM_BACKEND_LZ4HC=y
CONFIG_ZRAM_BACKEND_ZSTD=y
CONFIG_ZRAM_BACKEND_DEFLATE=y
CONFIG_ZRAM_BACKEND_842=y
CONFIG_ZRAM_BACKEND_LZO=y
CONFIG_ZRAM_DEF_COMP_DEFLATE=y
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_DRBD=m
CONFIG_BLK_DEV_NBD=m
@ -486,6 +493,7 @@ CONFIG_DM_UEVENT=y
CONFIG_DM_FLAKEY=m
CONFIG_DM_VERITY=m
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_PLATFORM_KEYRING=y
CONFIG_DM_SWITCH=m
CONFIG_DM_INTEGRITY=m
CONFIG_DM_VDO=m
@ -535,6 +543,7 @@ CONFIG_NLMON=m
CONFIG_MLX4_EN=m
CONFIG_MLX5_CORE=m
CONFIG_MLX5_CORE_EN=y
# CONFIG_NET_VENDOR_META is not set
# CONFIG_NET_VENDOR_MICREL is not set
# CONFIG_NET_VENDOR_MICROCHIP is not set
# CONFIG_NET_VENDOR_MICROSEMI is not set
@ -695,6 +704,7 @@ CONFIG_NFSD=m
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_V4_SECURITY_LABEL=y
# CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set
CONFIG_CIFS=m
CONFIG_CIFS_UPCALL=y
CONFIG_CIFS_XATTR=y
@ -740,7 +750,6 @@ CONFIG_CRYPTO_DH=m
CONFIG_CRYPTO_ECDH=m
CONFIG_CRYPTO_ECDSA=m
CONFIG_CRYPTO_ECRDSA=m
CONFIG_CRYPTO_SM2=m
CONFIG_CRYPTO_CURVE25519=m
CONFIG_CRYPTO_AES_TI=m
CONFIG_CRYPTO_ANUBIS=m

View File

@ -48,7 +48,6 @@ CONFIG_NUMA=y
CONFIG_HZ_100=y
CONFIG_CERT_STORE=y
CONFIG_EXPOLINE=y
# CONFIG_EXPOLINE_EXTERN is not set
CONFIG_EXPOLINE_AUTO=y
CONFIG_CHSC_SCH=y
CONFIG_VFIO_CCW=m
@ -89,6 +88,7 @@ CONFIG_BINFMT_MISC=m
CONFIG_ZSWAP=y
CONFIG_ZSWAP_ZPOOL_DEFAULT_ZBUD=y
CONFIG_ZSMALLOC_STAT=y
CONFIG_SLAB_BUCKETS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_MEMORY_HOTPLUG=y
CONFIG_MEMORY_HOTREMOVE=y
@ -416,6 +416,13 @@ CONFIG_DEVTMPFS_SAFE=y
# CONFIG_FW_LOADER is not set
CONFIG_CONNECTOR=y
CONFIG_ZRAM=y
CONFIG_ZRAM_BACKEND_LZ4=y
CONFIG_ZRAM_BACKEND_LZ4HC=y
CONFIG_ZRAM_BACKEND_ZSTD=y
CONFIG_ZRAM_BACKEND_DEFLATE=y
CONFIG_ZRAM_BACKEND_842=y
CONFIG_ZRAM_BACKEND_LZO=y
CONFIG_ZRAM_DEF_COMP_DEFLATE=y
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_DRBD=m
CONFIG_BLK_DEV_NBD=m
@ -476,6 +483,7 @@ CONFIG_DM_UEVENT=y
CONFIG_DM_FLAKEY=m
CONFIG_DM_VERITY=m
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG=y
CONFIG_DM_VERITY_VERIFY_ROOTHASH_SIG_PLATFORM_KEYRING=y
CONFIG_DM_SWITCH=m
CONFIG_DM_INTEGRITY=m
CONFIG_DM_VDO=m
@ -525,6 +533,7 @@ CONFIG_NLMON=m
CONFIG_MLX4_EN=m
CONFIG_MLX5_CORE=m
CONFIG_MLX5_CORE_EN=y
# CONFIG_NET_VENDOR_META is not set
# CONFIG_NET_VENDOR_MICREL is not set
# CONFIG_NET_VENDOR_MICROCHIP is not set
# CONFIG_NET_VENDOR_MICROSEMI is not set
@ -682,6 +691,7 @@ CONFIG_NFSD=m
CONFIG_NFSD_V3_ACL=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_V4_SECURITY_LABEL=y
# CONFIG_NFSD_LEGACY_CLIENT_TRACKING is not set
CONFIG_CIFS=m
CONFIG_CIFS_UPCALL=y
CONFIG_CIFS_XATTR=y
@ -726,7 +736,6 @@ CONFIG_CRYPTO_DH=m
CONFIG_CRYPTO_ECDH=m
CONFIG_CRYPTO_ECDSA=m
CONFIG_CRYPTO_ECRDSA=m
CONFIG_CRYPTO_SM2=m
CONFIG_CRYPTO_CURVE25519=m
CONFIG_CRYPTO_AES_TI=m
CONFIG_CRYPTO_ANUBIS=m
@ -767,6 +776,7 @@ CONFIG_CRYPTO_LZ4=m
CONFIG_CRYPTO_LZ4HC=m
CONFIG_CRYPTO_ZSTD=m
CONFIG_CRYPTO_ANSI_CPRNG=m
CONFIG_CRYPTO_JITTERENTROPY_OSR=1
CONFIG_CRYPTO_USER_API_HASH=m
CONFIG_CRYPTO_USER_API_SKCIPHER=m
CONFIG_CRYPTO_USER_API_RNG=m

View File

@ -49,6 +49,7 @@ CONFIG_ZFCP=y
# CONFIG_HVC_IUCV is not set
# CONFIG_HW_RANDOM_S390 is not set
# CONFIG_HMC_DRV is not set
# CONFIG_S390_UV_UAPI is not set
# CONFIG_S390_TAPE is not set
# CONFIG_VMCP is not set
# CONFIG_MONWRITER is not set

View File

@ -49,6 +49,7 @@ struct perf_sf_sde_regs {
};
#define perf_arch_fetch_caller_regs(regs, __ip) do { \
(regs)->psw.mask = 0; \
(regs)->psw.addr = (__ip); \
(regs)->gprs[15] = (unsigned long)__builtin_frame_address(0) - \
offsetof(struct stack_frame, back_chain); \

View File

@ -77,7 +77,7 @@ static int __diag_page_ref_service(struct kvm_vcpu *vcpu)
vcpu->stat.instruction_diagnose_258++;
if (vcpu->run->s.regs.gprs[rx] & 7)
return kvm_s390_inject_program_int(vcpu, PGM_SPECIFICATION);
rc = read_guest(vcpu, vcpu->run->s.regs.gprs[rx], rx, &parm, sizeof(parm));
rc = read_guest_real(vcpu, vcpu->run->s.regs.gprs[rx], &parm, sizeof(parm));
if (rc)
return kvm_s390_inject_prog_cond(vcpu, rc);
if (parm.parm_version != 2 || parm.parm_len < 5 || parm.code != 0x258)

View File

@ -828,6 +828,8 @@ static int access_guest_page(struct kvm *kvm, enum gacc_mode mode, gpa_t gpa,
const gfn_t gfn = gpa_to_gfn(gpa);
int rc;
if (!gfn_to_memslot(kvm, gfn))
return PGM_ADDRESSING;
if (mode == GACC_STORE)
rc = kvm_write_guest_page(kvm, gfn, data, offset, len);
else
@ -985,6 +987,8 @@ int access_guest_real(struct kvm_vcpu *vcpu, unsigned long gra,
gra += fragment_len;
data += fragment_len;
}
if (rc > 0)
vcpu->arch.pgm.code = rc;
return rc;
}

View File

@ -405,11 +405,12 @@ int read_guest_abs(struct kvm_vcpu *vcpu, unsigned long gpa, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @data (kernel space) to @gra (guest real address).
* It is up to the caller to ensure that the entire guest memory range is
* valid memory before calling this function.
* Guest low address and key protection are not checked.
*
* Returns zero on success or -EFAULT on error.
* Returns zero on success, -EFAULT when copying from @data failed, or
* PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
* is also stored to allow injecting into the guest (if applicable) using
* kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to guest memory.
*/
@ -428,11 +429,12 @@ int write_guest_real(struct kvm_vcpu *vcpu, unsigned long gra, void *data,
* @len: number of bytes to copy
*
* Copy @len bytes from @gra (guest real address) to @data (kernel space).
* It is up to the caller to ensure that the entire guest memory range is
* valid memory before calling this function.
* Guest key protection is not checked.
*
* Returns zero on success or -EFAULT on error.
* Returns zero on success, -EFAULT when copying to @data failed, or
* PGM_ADRESSING in case @gra is outside a memslot. In this case, pgm check info
* is also stored to allow injecting into the guest (if applicable) using
* kvm_s390_inject_prog_cond().
*
* If an error occurs data may have been copied partially to kernel space.
*/

View File

@ -280,18 +280,19 @@ static void __zpci_event_error(struct zpci_ccdf_err *ccdf)
goto no_pdev;
switch (ccdf->pec) {
case 0x003a: /* Service Action or Error Recovery Successful */
case 0x002a: /* Error event concerns FMB */
case 0x002b:
case 0x002c:
break;
case 0x0040: /* Service Action or Error Recovery Failed */
case 0x003b:
zpci_event_io_failure(pdev, pci_channel_io_perm_failure);
break;
default: /* PCI function left in the error state attempt to recover */
ers_res = zpci_event_attempt_error_recovery(pdev);
if (ers_res != PCI_ERS_RESULT_RECOVERED)
zpci_event_io_failure(pdev, pci_channel_io_perm_failure);
break;
default:
/*
* Mark as frozen not permanently failed because the device
* could be subsequently recovered by the platform.
*/
zpci_event_io_failure(pdev, pci_channel_io_frozen);
break;
}
pci_dev_put(pdev);
no_pdev:

View File

@ -2257,6 +2257,7 @@ config RANDOMIZE_MEMORY_PHYSICAL_PADDING
config ADDRESS_MASKING
bool "Linear Address Masking support"
depends on X86_64
depends on COMPILE_TEST || !CPU_MITIGATIONS # wait for LASS
help
Linear Address Masking (LAM) modifies the checking that is applied
to 64-bit linear addresses, allowing software to use of the

View File

@ -9,6 +9,8 @@
#include <asm/unwind_hints.h>
#include <asm/segment.h>
#include <asm/cache.h>
#include <asm/cpufeatures.h>
#include <asm/nospec-branch.h>
#include "calling.h"
@ -19,6 +21,9 @@ SYM_FUNC_START(entry_ibpb)
movl $PRED_CMD_IBPB, %eax
xorl %edx, %edx
wrmsr
/* Make sure IBPB clears return stack preductions too. */
FILL_RETURN_BUFFER %rax, RSB_CLEAR_LOOPS, X86_BUG_IBPB_NO_RET
RET
SYM_FUNC_END(entry_ibpb)
/* For KVM */

View File

@ -871,6 +871,8 @@ SYM_FUNC_START(entry_SYSENTER_32)
/* Now ready to switch the cr3 */
SWITCH_TO_USER_CR3 scratch_reg=%eax
/* Clobbers ZF */
CLEAR_CPU_BUFFERS
/*
* Restore all flags except IF. (We restore IF separately because
@ -881,7 +883,6 @@ SYM_FUNC_START(entry_SYSENTER_32)
BUG_IF_WRONG_CR3 no_user_check=1
popfl
popl %eax
CLEAR_CPU_BUFFERS
/*
* Return back to the vDSO, which will pop ecx and edx.
@ -1144,7 +1145,6 @@ SYM_CODE_START(asm_exc_nmi)
/* Not on SYSENTER stack. */
call exc_nmi
CLEAR_CPU_BUFFERS
jmp .Lnmi_return
.Lnmi_from_sysenter_stack:
@ -1165,6 +1165,7 @@ SYM_CODE_START(asm_exc_nmi)
CHECK_AND_APPLY_ESPFIX
RESTORE_ALL_NMI cr3_reg=%edi pop=4
CLEAR_CPU_BUFFERS
jmp .Lirq_return
#ifdef CONFIG_X86_ESPFIX32
@ -1206,6 +1207,7 @@ SYM_CODE_START(asm_exc_nmi)
* 1 - orig_ax
*/
lss (1+5+6)*4(%esp), %esp # back to espfix stack
CLEAR_CPU_BUFFERS
jmp .Lirq_return
#endif
SYM_CODE_END(asm_exc_nmi)

View File

@ -116,7 +116,10 @@ static inline bool amd_gart_present(void)
#define amd_nb_num(x) 0
#define amd_nb_has_feature(x) false
#define node_to_amd_nb(x) NULL
static inline struct amd_northbridge *node_to_amd_nb(int node)
{
return NULL;
}
#define amd_gart_present(x) false
#endif

View File

@ -215,7 +215,7 @@
#define X86_FEATURE_SPEC_STORE_BYPASS_DISABLE ( 7*32+23) /* Disable Speculative Store Bypass. */
#define X86_FEATURE_LS_CFG_SSBD ( 7*32+24) /* AMD SSBD implementation via LS_CFG MSR */
#define X86_FEATURE_IBRS ( 7*32+25) /* "ibrs" Indirect Branch Restricted Speculation */
#define X86_FEATURE_IBPB ( 7*32+26) /* "ibpb" Indirect Branch Prediction Barrier */
#define X86_FEATURE_IBPB ( 7*32+26) /* "ibpb" Indirect Branch Prediction Barrier without a guaranteed RSB flush */
#define X86_FEATURE_STIBP ( 7*32+27) /* "stibp" Single Thread Indirect Branch Predictors */
#define X86_FEATURE_ZEN ( 7*32+28) /* Generic flag for all Zen and newer */
#define X86_FEATURE_L1TF_PTEINV ( 7*32+29) /* L1TF workaround PTE inversion */
@ -348,6 +348,7 @@
#define X86_FEATURE_CPPC (13*32+27) /* "cppc" Collaborative Processor Performance Control */
#define X86_FEATURE_AMD_PSFD (13*32+28) /* Predictive Store Forwarding Disable */
#define X86_FEATURE_BTC_NO (13*32+29) /* Not vulnerable to Branch Type Confusion */
#define X86_FEATURE_AMD_IBPB_RET (13*32+30) /* IBPB clears return address predictor */
#define X86_FEATURE_BRS (13*32+31) /* "brs" Branch Sampling available */
/* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
@ -523,4 +524,5 @@
#define X86_BUG_DIV0 X86_BUG(1*32 + 1) /* "div0" AMD DIV0 speculation bug */
#define X86_BUG_RFDS X86_BUG(1*32 + 2) /* "rfds" CPU is vulnerable to Register File Data Sampling */
#define X86_BUG_BHI X86_BUG(1*32 + 3) /* "bhi" CPU is affected by Branch History Injection */
#define X86_BUG_IBPB_NO_RET X86_BUG(1*32 + 4) /* "ibpb_no_ret" IBPB omits return target predictions */
#endif /* _ASM_X86_CPUFEATURES_H */

View File

@ -323,7 +323,16 @@
* Note: Only the memory operand variant of VERW clears the CPU buffers.
*/
.macro CLEAR_CPU_BUFFERS
ALTERNATIVE "", __stringify(verw _ASM_RIP(mds_verw_sel)), X86_FEATURE_CLEAR_CPU_BUF
#ifdef CONFIG_X86_64
ALTERNATIVE "", "verw mds_verw_sel(%rip)", X86_FEATURE_CLEAR_CPU_BUF
#else
/*
* In 32bit mode, the memory operand must be a %cs reference. The data
* segments may not be usable (vm86 mode), and the stack segment may not
* be flat (ESPFIX32).
*/
ALTERNATIVE "", "verw %cs:mds_verw_sel", X86_FEATURE_CLEAR_CPU_BUF
#endif
.endm
#ifdef CONFIG_X86_64

View File

@ -6,7 +6,7 @@
typeof(sym) __ret; \
asm_inline("mov %1,%0\n1:\n" \
".pushsection runtime_ptr_" #sym ",\"a\"\n\t" \
".long 1b - %c2 - .\n\t" \
".long 1b - %c2 - .\n" \
".popsection" \
:"=r" (__ret) \
:"i" ((unsigned long)0x0123456789abcdefull), \
@ -20,7 +20,7 @@
typeof(0u+(val)) __ret = (val); \
asm_inline("shrl $12,%k0\n1:\n" \
".pushsection runtime_shift_" #sym ",\"a\"\n\t" \
".long 1b - 1 - .\n\t" \
".long 1b - 1 - .\n" \
".popsection" \
:"+r" (__ret)); \
__ret; })

View File

@ -12,6 +12,13 @@
#include <asm/cpufeatures.h>
#include <asm/page.h>
#include <asm/percpu.h>
#include <asm/runtime-const.h>
/*
* Virtual variable: there's no actual backing store for this,
* it can purely be used as 'runtime_const_ptr(USER_PTR_MAX)'
*/
extern unsigned long USER_PTR_MAX;
#ifdef CONFIG_ADDRESS_MASKING
/*
@ -46,19 +53,24 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
#endif
/*
* The virtual address space space is logically divided into a kernel
* half and a user half. When cast to a signed type, user pointers
* are positive and kernel pointers are negative.
*/
#define valid_user_address(x) ((__force long)(x) >= 0)
#define valid_user_address(x) \
((__force unsigned long)(x) <= runtime_const_ptr(USER_PTR_MAX))
/*
* Masking the user address is an alternative to a conditional
* user_access_begin that can avoid the fencing. This only works
* for dense accesses starting at the address.
*/
#define mask_user_address(x) ((typeof(x))((long)(x)|((long)(x)>>63)))
static inline void __user *mask_user_address(const void __user *ptr)
{
unsigned long mask;
asm("cmp %1,%0\n\t"
"sbb %0,%0"
:"=r" (mask)
:"r" (ptr),
"0" (runtime_const_ptr(USER_PTR_MAX)));
return (__force void __user *)(mask | (__force unsigned long)ptr);
}
#define masked_user_access_begin(x) ({ \
__auto_type __masked_ptr = (x); \
__masked_ptr = mask_user_address(__masked_ptr); \
@ -69,23 +81,16 @@ static inline unsigned long __untagged_addr_remote(struct mm_struct *mm,
* arbitrary values in those bits rather then masking them off.
*
* Enforce two rules:
* 1. 'ptr' must be in the user half of the address space
* 1. 'ptr' must be in the user part of the address space
* 2. 'ptr+size' must not overflow into kernel addresses
*
* Note that addresses around the sign change are not valid addresses,
* and will GP-fault even with LAM enabled if the sign bit is set (see
* "CR3.LAM_SUP" that can narrow the canonicality check if we ever
* enable it, but not remove it entirely).
*
* So the "overflow into kernel addresses" does not imply some sudden
* exact boundary at the sign bit, and we can allow a lot of slop on the
* size check.
* Note that we always have at least one guard page between the
* max user address and the non-canonical gap, allowing us to
* ignore small sizes entirely.
*
* In fact, we could probably remove the size check entirely, since
* any kernel accesses will be in increasing address order starting
* at 'ptr', and even if the end might be in kernel space, we'll
* hit the GP faults for non-canonical accesses before we ever get
* there.
* at 'ptr'.
*
* That's a separate optimization, for now just handle the small
* constant case.

View File

@ -44,6 +44,7 @@
#define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
#define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
#define PCI_DEVICE_ID_AMD_1AH_M00H_DF_F4 0x12c4
#define PCI_DEVICE_ID_AMD_1AH_M20H_DF_F4 0x16fc
#define PCI_DEVICE_ID_AMD_1AH_M60H_DF_F4 0x124c
#define PCI_DEVICE_ID_AMD_1AH_M70H_DF_F4 0x12bc
#define PCI_DEVICE_ID_AMD_MI200_DF_F4 0x14d4
@ -127,6 +128,7 @@ static const struct pci_device_id amd_nb_link_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F4) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F4) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M00H_DF_F4) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M20H_DF_F4) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M60H_DF_F4) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_1AH_M70H_DF_F4) },
{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_DF_F4) },

View File

@ -440,7 +440,19 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
v = apic_read(APIC_LVTT);
v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
apic_write(APIC_LVTT, v);
apic_write(APIC_TMICT, 0);
/*
* Setting APIC_LVT_MASKED (above) should be enough to tell
* the hardware that this timer will never fire. But AMD
* erratum 411 and some Intel CPU behavior circa 2024 say
* otherwise. Time for belt and suspenders programming: mask
* the timer _and_ zero the counter registers:
*/
if (v & APIC_LVT_TIMER_TSCDEADLINE)
wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
else
apic_write(APIC_TMICT, 0);
return 0;
}

View File

@ -1202,5 +1202,6 @@ void amd_check_microcode(void)
if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD)
return;
on_each_cpu(zenbleed_check_cpu, NULL, 1);
if (cpu_feature_enabled(X86_FEATURE_ZEN2))
on_each_cpu(zenbleed_check_cpu, NULL, 1);
}

View File

@ -1115,8 +1115,25 @@ static void __init retbleed_select_mitigation(void)
case RETBLEED_MITIGATION_IBPB:
setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
/*
* IBPB on entry already obviates the need for
* software-based untraining so clear those in case some
* other mitigation like SRSO has selected them.
*/
setup_clear_cpu_cap(X86_FEATURE_UNRET);
setup_clear_cpu_cap(X86_FEATURE_RETHUNK);
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
mitigate_smt = true;
/*
* There is no need for RSB filling: entry_ibpb() ensures
* all predictions, including the RSB, are invalidated,
* regardless of IBPB implementation.
*/
setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT);
break;
case RETBLEED_MITIGATION_STUFF:
@ -2627,6 +2644,14 @@ static void __init srso_select_mitigation(void)
if (has_microcode) {
setup_force_cpu_cap(X86_FEATURE_ENTRY_IBPB);
srso_mitigation = SRSO_MITIGATION_IBPB;
/*
* IBPB on entry already obviates the need for
* software-based untraining so clear those in case some
* other mitigation like Retbleed has selected them.
*/
setup_clear_cpu_cap(X86_FEATURE_UNRET);
setup_clear_cpu_cap(X86_FEATURE_RETHUNK);
}
} else {
pr_err("WARNING: kernel not compiled with MITIGATION_IBPB_ENTRY.\n");
@ -2638,6 +2663,13 @@ static void __init srso_select_mitigation(void)
if (!boot_cpu_has(X86_FEATURE_ENTRY_IBPB) && has_microcode) {
setup_force_cpu_cap(X86_FEATURE_IBPB_ON_VMEXIT);
srso_mitigation = SRSO_MITIGATION_IBPB_ON_VMEXIT;
/*
* There is no need for RSB filling: entry_ibpb() ensures
* all predictions, including the RSB, are invalidated,
* regardless of IBPB implementation.
*/
setup_clear_cpu_cap(X86_FEATURE_RSB_VMEXIT);
}
} else {
pr_err("WARNING: kernel not compiled with MITIGATION_SRSO.\n");

View File

@ -69,6 +69,7 @@
#include <asm/sev.h>
#include <asm/tdx.h>
#include <asm/posted_intr.h>
#include <asm/runtime-const.h>
#include "cpu.h"
@ -1443,6 +1444,9 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
boot_cpu_has(X86_FEATURE_HYPERVISOR)))
setup_force_cpu_bug(X86_BUG_BHI);
if (cpu_has(c, X86_FEATURE_AMD_IBPB) && !cpu_has(c, X86_FEATURE_AMD_IBPB_RET))
setup_force_cpu_bug(X86_BUG_IBPB_NO_RET);
if (cpu_matches(cpu_vuln_whitelist, NO_MELTDOWN))
return;
@ -2386,6 +2390,15 @@ void __init arch_cpu_finalize_init(void)
alternative_instructions();
if (IS_ENABLED(CONFIG_X86_64)) {
unsigned long USER_PTR_MAX = TASK_SIZE_MAX-1;
/*
* Enable this when LAM is gated on LASS support
if (cpu_feature_enabled(X86_FEATURE_LAM))
USER_PTR_MAX = (1ul << 63) - PAGE_SIZE - 1;
*/
runtime_const_init(ptr, USER_PTR_MAX);
/*
* Make sure the first 2MB area is not mapped by huge pages
* There are typically fixed size MTRRs in there and overlapping

View File

@ -584,7 +584,7 @@ void __init load_ucode_amd_bsp(struct early_load_data *ed, unsigned int cpuid_1_
native_rdmsr(MSR_AMD64_PATCH_LEVEL, ed->new_rev, dummy);
}
static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t size);
static enum ucode_state _load_microcode_amd(u8 family, const u8 *data, size_t size);
static int __init save_microcode_in_initrd(void)
{
@ -605,7 +605,7 @@ static int __init save_microcode_in_initrd(void)
if (!desc.mc)
return -EINVAL;
ret = load_microcode_amd(x86_family(cpuid_1_eax), desc.data, desc.size);
ret = _load_microcode_amd(x86_family(cpuid_1_eax), desc.data, desc.size);
if (ret > UCODE_UPDATED)
return -EINVAL;
@ -613,16 +613,19 @@ static int __init save_microcode_in_initrd(void)
}
early_initcall(save_microcode_in_initrd);
static inline bool patch_cpus_equivalent(struct ucode_patch *p, struct ucode_patch *n)
static inline bool patch_cpus_equivalent(struct ucode_patch *p,
struct ucode_patch *n,
bool ignore_stepping)
{
/* Zen and newer hardcode the f/m/s in the patch ID */
if (x86_family(bsp_cpuid_1_eax) >= 0x17) {
union cpuid_1_eax p_cid = ucode_rev_to_cpuid(p->patch_id);
union cpuid_1_eax n_cid = ucode_rev_to_cpuid(n->patch_id);
/* Zap stepping */
p_cid.stepping = 0;
n_cid.stepping = 0;
if (ignore_stepping) {
p_cid.stepping = 0;
n_cid.stepping = 0;
}
return p_cid.full == n_cid.full;
} else {
@ -644,13 +647,13 @@ static struct ucode_patch *cache_find_patch(struct ucode_cpu_info *uci, u16 equi
WARN_ON_ONCE(!n.patch_id);
list_for_each_entry(p, &microcode_cache, plist)
if (patch_cpus_equivalent(p, &n))
if (patch_cpus_equivalent(p, &n, false))
return p;
return NULL;
}
static inline bool patch_newer(struct ucode_patch *p, struct ucode_patch *n)
static inline int patch_newer(struct ucode_patch *p, struct ucode_patch *n)
{
/* Zen and newer hardcode the f/m/s in the patch ID */
if (x86_family(bsp_cpuid_1_eax) >= 0x17) {
@ -659,6 +662,9 @@ static inline bool patch_newer(struct ucode_patch *p, struct ucode_patch *n)
zp.ucode_rev = p->patch_id;
zn.ucode_rev = n->patch_id;
if (zn.stepping != zp.stepping)
return -1;
return zn.rev > zp.rev;
} else {
return n->patch_id > p->patch_id;
@ -668,10 +674,14 @@ static inline bool patch_newer(struct ucode_patch *p, struct ucode_patch *n)
static void update_cache(struct ucode_patch *new_patch)
{
struct ucode_patch *p;
int ret;
list_for_each_entry(p, &microcode_cache, plist) {
if (patch_cpus_equivalent(p, new_patch)) {
if (!patch_newer(p, new_patch)) {
if (patch_cpus_equivalent(p, new_patch, true)) {
ret = patch_newer(p, new_patch);
if (ret < 0)
continue;
else if (!ret) {
/* we already have the latest patch */
kfree(new_patch->data);
kfree(new_patch);
@ -944,6 +954,20 @@ static enum ucode_state __load_microcode_amd(u8 family, const u8 *data,
return UCODE_OK;
}
static enum ucode_state _load_microcode_amd(u8 family, const u8 *data, size_t size)
{
enum ucode_state ret;
/* free old equiv table */
free_equiv_cpu_table();
ret = __load_microcode_amd(family, data, size);
if (ret != UCODE_OK)
cleanup();
return ret;
}
static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t size)
{
struct cpuinfo_x86 *c;
@ -951,14 +975,9 @@ static enum ucode_state load_microcode_amd(u8 family, const u8 *data, size_t siz
struct ucode_patch *p;
enum ucode_state ret;
/* free old equiv table */
free_equiv_cpu_table();
ret = __load_microcode_amd(family, data, size);
if (ret != UCODE_OK) {
cleanup();
ret = _load_microcode_amd(family, data, size);
if (ret != UCODE_OK)
return ret;
}
for_each_node(nid) {
cpu = cpumask_first(cpumask_of_node(nid));

View File

@ -207,7 +207,7 @@ static inline bool rdt_get_mb_table(struct rdt_resource *r)
return false;
}
static bool __get_mem_config_intel(struct rdt_resource *r)
static __init bool __get_mem_config_intel(struct rdt_resource *r)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
union cpuid_0x10_3_eax eax;
@ -241,7 +241,7 @@ static bool __get_mem_config_intel(struct rdt_resource *r)
return true;
}
static bool __rdt_get_mem_config_amd(struct rdt_resource *r)
static __init bool __rdt_get_mem_config_amd(struct rdt_resource *r)
{
struct rdt_hw_resource *hw_res = resctrl_to_arch_res(r);
u32 eax, ebx, ecx, edx, subleaf;

Some files were not shown because too many files have changed in this diff Show More