mirror of
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
synced 2024-12-29 17:23:36 +00:00
100 lines
4.2 KiB
ReStructuredText
100 lines
4.2 KiB
ReStructuredText
|
.. SPDX-License-Identifier: GPL-2.0
|
||
|
|
||
|
:orphan:
|
||
|
|
||
|
==============
|
||
|
MSM Preemption
|
||
|
==============
|
||
|
|
||
|
Preemption allows Adreno GPUs to switch to a higher priority ring when work is
|
||
|
pushed to it, reducing latency for high priority submissions.
|
||
|
|
||
|
When preemption is enabled 4 rings are initialized, corresponding to different
|
||
|
priority levels. Having multiple rings is purely a software concept as the GPU
|
||
|
only has registers to keep track of one graphics ring.
|
||
|
The kernel is able to switch which ring is currently being processed by
|
||
|
requesting preemption. When certain conditions are met, depending on the
|
||
|
priority level, the GPU will save its current state in a series of buffers,
|
||
|
then restores state from a similar set of buffers specified by the kernel. It
|
||
|
then resumes execution and fires an IRQ to let the kernel know the context
|
||
|
switch has completed.
|
||
|
|
||
|
This mechanism can be used by the kernel to switch between rings. Whenever a
|
||
|
submission occurs the kernel finds the highest priority ring which isn't empty
|
||
|
and preempts to it if said ring is not the one being currently executed. This is
|
||
|
also done whenever a submission completes to make sure execution resumes on a
|
||
|
lower priority ring when a higher priority ring is done.
|
||
|
|
||
|
Preemption levels
|
||
|
-----------------
|
||
|
|
||
|
Preemption can only occur at certain boundaries. The exact conditions can be
|
||
|
configured by changing the preemption level, this allows to compromise between
|
||
|
latency (ie. the time that passes between when the kernel requests preemption
|
||
|
and when the SQE begins saving state) and overhead (the amount of state that
|
||
|
needs to be saved).
|
||
|
|
||
|
The GPU offers 3 levels:
|
||
|
|
||
|
Level 0
|
||
|
Preemption only occurs at the submission level. This requires the least amount
|
||
|
of state to be saved as the execution of userspace submitted IBs is never
|
||
|
interrupted, however it offers very little benefit compared to not enabling
|
||
|
preemption of any kind.
|
||
|
|
||
|
Level 1
|
||
|
Preemption occurs at either bin level, if using GMEM rendering, or draw level
|
||
|
in the sysmem rendering case.
|
||
|
|
||
|
Level 2
|
||
|
Preemption occurs at draw level.
|
||
|
|
||
|
Level 1 is the mode that is used by the msm driver.
|
||
|
|
||
|
Additionally the GPU allows to specify a `skip_save_restore` option. This
|
||
|
disables the saving and restoring of all registers except those relating to the
|
||
|
operation of the SQE itself, reducing overhead. Saving and restoring is only
|
||
|
skipped when using GMEM with Level 1 preemption. When enabling this userspace is
|
||
|
expected to set the state that isn't preserved whenever preemption occurs which
|
||
|
is done by specifying preamble and postambles. Those are IBs that are executed
|
||
|
before and after preemption.
|
||
|
|
||
|
Preemption buffers
|
||
|
------------------
|
||
|
|
||
|
A series of buffers are necessary to store the state of rings while they are not
|
||
|
being executed. There are different kinds of preemption records and most of
|
||
|
those require one buffer per ring. This is because preemption never occurs
|
||
|
between submissions on the same ring, which always run in sequence when the ring
|
||
|
is active. This means that only one context per ring is effectively active.
|
||
|
|
||
|
SMMU_INFO
|
||
|
This buffer contains info about the current SMMU configuration such as the
|
||
|
ttbr0 register. The SQE firmware isn't actually able to save this record.
|
||
|
As a result SMMU info must be saved manually from the CP to a buffer and the
|
||
|
SMMU record updated with info from said buffer before triggering
|
||
|
preemption.
|
||
|
|
||
|
NON_SECURE
|
||
|
This is the main preemption record where most state is saved. It is mostly
|
||
|
opaque to the kernel except for the first few words that must be initialized
|
||
|
by the kernel.
|
||
|
|
||
|
SECURE
|
||
|
This saves state related to the GPU's secure mode.
|
||
|
|
||
|
NON_PRIV
|
||
|
The intended purpose of this record is unknown. The SQE firmware actually
|
||
|
ignores it and therefore msm doesn't handle it.
|
||
|
|
||
|
COUNTER
|
||
|
This record is used to save and restore performance counters.
|
||
|
|
||
|
Handling the permissions of those buffers is critical for security. All but the
|
||
|
NON_PRIV records need to be inaccessible from userspace, so they must be mapped
|
||
|
in the kernel address space with the MSM_BO_MAP_PRIV flag.
|
||
|
For example, making the NON_SECURE record accessible from userspace would allow
|
||
|
any process to manipulate a saved ring's RPTR which can be used to skip the
|
||
|
execution of some packets in a ring and execute user commands with higher
|
||
|
privileges.
|