Merge tag 'drm-intel-next-2019-11-01-1' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- Make context persistence optional
  Allow userspace to tie the context lifetime to FD lifetime,
  effectively allowing Ctrl-C killing of a process to also clean
  up the hardware immediately.
  Compute changes: https://github.com/intel/compute-runtime/pull/228
  The compute driver is shipping in Ubuntu. uAPI acked by Mesa folks.

- Put future HW and their uAPIs under STAGING & BROKEN
  Introduces DRM_I915_UNSTABLE Kconfig menu for working on the new
  uAPI for future HW in upstream. We already disable driver loading
  by default the platform is deemed ready. This is a second level
  of protection based on compile time switch (STAGING & BROKEN).

- Under DRM_I915_UNSTABLE: Add the fake lmem region on iGFX
  Fake local memory region on integrated GPU through cmdline:
  memmap=2G$16G i915.fake_lmem_start=0x400000000
  Currently allows testing non-mappable GGTT behavior and running
  kernel selftest for local memory.

Driver Changes:

- Fix Bugzilla #112084: VGA external monitor not working (Ville)
- Add support for half float framebuffers (Ville)
- Add perf support on TGL (Lionel)
- Replace hangcheck by heartbeats (Chris)
- Allow SPT PCH on all AML devices (James)
- Add new CNL PCH for CML platform (Imre)
- Allow 100 ms (Kconfig) for workloads to exit before reset (Chris, Jon, Joonas)
- Forcibly pre-empt a context after 100 ms (Kconfig) of delay  (Chris)
- Make timeslice duration Kconfig configurable (Chris)
- Whitelist PS_(DEPTH|INVOCATION)_COUNT for Tigerlake (Tapani)
- Support creating LMEM objects in kernel (Matt A)
- Adjust the location of RING_MI_MODE in the context image for TGL (Chris)
- Handle AUX interrupts for TC ports (Matt R)
- Add support for devices without mappable GGTT aperture (Daniele)
- Rename "inject_load_failure" module parameter to "inject_probe_failure" (Janusz)
- Handle fused off HDCP, FBC, DMC and DSC (Jose)
- Add support to one DP-MST stream on Tigerlake (Lucas)
- Add HuC firmware (and GuC) for TGL (Daniele)
- Allow ICL+ DSI on any pipe (Ville)

- Check some transcoder timing minimum limits (Ville)
- Don't set queue_priority_hint if we don't kick the submission (Chris)
- Introduce barrier pulses along engines to flush idle/in-flight requests (Chris)
- Drop assertion that ce->pin_mutex guards state updates (Chris)
- Cancel banned contexts on schedule-out (Chris)
- Cancel contexts when hangchecking is disabled (Chris)
- Catch GTT fault errors for gen11+ planes (Matt R)
- Print in debugfs if PSR is not enabled because of sink (Jose)
- Do not set MOCS control values on dgfx (Lucas)
- Setup io-mapping for LMEM (Abdiel)
- Support kernel mapping of LMEM objects (Abdiel)
- Add LMEM selftests (Matt A)
- Initialise PMU spinlock before registering (Chris)
- Clear DKL_TX_PMD_LANE_SUS before program TC voltage swing (Jose)
- Flip interpretation of ips fmin/fmax to max rps (Chris)
- Add VBT compression parameter block definition (Jani)
- Limit the blitter sizes to ensure low preemption latency (Chris)
- Fixup block_size rounding on BLT (Matt A)
- Don't try to place HWS in non-existing mappable region (Michal Wa)
- Don't allocate the ring in stolen if we lack aperture (Matt A)
- Add AUX B & C to DC_OFF_POWER_DOMAINS for Tigerlake (Matt R)
- Avoid HPD poll detect triggering a new detect cycle (Imre)
- Document the userspace fail with possible_crtcs (Ville)
- Drop lrc header page now unused by GuC (Daniele)
- Do not switch aux to TBT mode for non-TC ports (Jose)

- Restructure code to avoid depending on i915 but smaller structs (Chris, Tvrtko, Andi)
- Remove pm park/unpark notifications (Chris)
- Avoid lockdep cross-contamination between object types (Chris)
- Restructure DSC code (Jani)
- Fix dead locking in early workload shadow (Zhenyu)
- Split the legacy submission backend from the common CS ring buffer (Chris)
- Move intel_engine_context_in/out into intel_lrc.c (Tvrtko)
- Describe perf/wakeref structure members in documentation (Anna)
- Update renamed header files names in documentation (Anna)
- Add debugs to distingiush a cd2x update from a full cdclk pll update (Ville)
- Rework atomic global state locking (Ville)
- Allow planes to declare their minimum acceptable cdclk (Ville)
- Eliminate skl_check_pipe_max_pixel_rate() and simplify skl_max_scale() (Ville)
- Making loglevel of PSR2/SU logs same (Ap)
- Capture aux page table error register (Lionel)
- Add is_dgfx to device info (Jose)
- Split gen11_irq_handler to make it shareable (Lucas)
- Encapsulate kconfig constant values inside boolean predicates (Chris)
- Split memory_region initialisation into its own file (Chris)
- Use _PICK() for CHICKEN_TRANS() and add CHICKEN_TRANS_D (Ville)
- Add perf helper macros for comparing with whitelisted registers (Umesh)
- Fix i915_inject_load_error() name to read *_probe_* (Janusz)
- Drop unused AUX register offsets (Matt R)
- Provide more information on DP AUX failures (Matt R)
- Add GAM/SFC instdone to error state (Mika)
- Always track callers to intel_rps_mark_interactive() (Chris)
- Nuke 'mode' argument to intel_get_load_detect_pipe() (Ville)
- Simplify LVDS crtc_mask and pipe_mask setup (Ville)
- Stop frobbing crtc->base.mode (Ville)
- Do s/crtc_mask/pipe_mask/ (Ville)
- Split detaching and removing the vma (Chris)

- Selftest improvements (Chris, Tvrtko, Mika, Matt A, Lionel)
- GuC code improvements (Rob, Andi, Daniele)

- Check against i915_selftest only under CONFIG_SELFTEST (Chris)
- Refine occupancy test in kill_context() (Chris)
- Start kthreads before stopping (Chris)

Signed-off-by: Dave Airlie <airlied@redhat.com>
From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20191101104718.GA14323@jlahtine-desk.ger.corp.intel.com
This commit is contained in:
Dave Airlie 2019-11-04 09:56:25 +10:00
commit 2ef4144d1e
172 changed files with 8787 additions and 5014 deletions

View File

@ -550,9 +550,9 @@ i915 Perf Stream
This section covers the stream-semantics-agnostic structures and functions
for representing an i915 perf stream FD and associated file operations.
.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
.. kernel-doc:: drivers/gpu/drm/i915/i915_perf_types.h
:functions: i915_perf_stream
.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
.. kernel-doc:: drivers/gpu/drm/i915/i915_perf_types.h
:functions: i915_perf_stream_ops
.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c
@ -577,7 +577,7 @@ for representing an i915 perf stream FD and associated file operations.
i915 Perf Observation Architecture Stream
-----------------------------------------
.. kernel-doc:: drivers/gpu/drm/i915/i915_drv.h
.. kernel-doc:: drivers/gpu/drm/i915/i915_perf_types.h
:functions: i915_oa_ops
.. kernel-doc:: drivers/gpu/drm/i915/i915_perf.c

View File

@ -148,3 +148,9 @@ menu "drm/i915 Profile Guided Optimisation"
depends on DRM_I915
source "drivers/gpu/drm/i915/Kconfig.profile"
endmenu
menu "drm/i915 Unstable Evolution"
visible if EXPERT && STAGING && BROKEN
depends on DRM_I915
source "drivers/gpu/drm/i915/Kconfig.unstable"
endmenu

View File

@ -36,6 +36,7 @@ config DRM_I915_DEBUG
select DRM_I915_SELFTEST
select DRM_I915_DEBUG_RUNTIME_PM
select DRM_I915_DEBUG_MMIO
select BROKEN # for prototype uAPI
default n
help
Choose this option to turn on extra driver debugging that may affect

View File

@ -12,6 +12,29 @@ config DRM_I915_USERFAULT_AUTOSUSPEND
May be 0 to disable the extra delay and solely use the device level
runtime pm autosuspend delay tunable.
config DRM_I915_HEARTBEAT_INTERVAL
int "Interval between heartbeat pulses (ms)"
default 2500 # milliseconds
help
The driver sends a periodic heartbeat down all active engines to
check the health of the GPU and undertake regular house-keeping of
internal driver state.
May be 0 to disable heartbeats and therefore disable automatic GPU
hang detection.
config DRM_I915_PREEMPT_TIMEOUT
int "Preempt timeout (ms, jiffy granularity)"
default 100 # milliseconds
help
How long to wait (in milliseconds) for a preemption event to occur
when submitting a new context via execlists. If the current context
does not hit an arbitration point and yield to HW before the timer
expires, the HW will be reset to allow the more important context
to execute.
May be 0 to disable the timeout.
config DRM_I915_SPIN_REQUEST
int "Busywait for request completion (us)"
default 5 # microseconds
@ -25,3 +48,29 @@ config DRM_I915_SPIN_REQUEST
May be 0 to disable the initial spin. In practice, we estimate
the cost of enabling the interrupt (if currently disabled) to be
a few microseconds.
config DRM_I915_STOP_TIMEOUT
int "How long to wait for an engine to quiesce gracefully before reset (ms)"
default 100 # milliseconds
help
By stopping submission and sleeping for a short time before resetting
the GPU, we allow the innocent contexts also on the system to quiesce.
It is then less likely for a hanging context to cause collateral
damage as the system is reset in order to recover. The corollary is
that the reset itself may take longer and so be more disruptive to
interactive or low latency workloads.
config DRM_I915_TIMESLICE_DURATION
int "Scheduling quantum for userspace batches (ms, jiffy granularity)"
default 1 # milliseconds
help
When two user batches of equal priority are executing, we will
alternate execution of each batch to ensure forward progress of
all users. This is necessary in some cases where there may be
an implicit dependency between those batches that requires
concurrent execution in order for them to proceed, e.g. they
interact with each other via userspace semaphores. Each context
is scheduled for execution for the timeslice duration, before
switching to the next context.
May be 0 to disable timeslicing.

View File

@ -0,0 +1,29 @@
# SPDX-License-Identifier: GPL-2.0-only
config DRM_I915_UNSTABLE
bool "Enable unstable API for early prototype development"
depends on EXPERT
depends on STAGING
depends on BROKEN # should never be enabled by distros!
# We use the dependency on !COMPILE_TEST to not be enabled in
# allmodconfig or allyesconfig configurations
depends on !COMPILE_TEST
default n
help
Enable prototype uAPI under general discussion before they are
finalized. Such prototypes may be withdrawn or substantially
changed before release. They are only enabled here so that a wide
number of interested parties (userspace driver developers) can
verify that the uAPI meet their expectations. These uAPI should
never be used in production.
Recommended for driver developers _only_.
If in the slightest bit of doubt, say "N".
config DRM_I915_UNSTABLE_FAKE_LMEM
bool "Enable the experimental fake lmem"
depends on DRM_I915_UNSTABLE
default n
help
Convert some system memory into a fake local memory region for
testing.

View File

@ -78,22 +78,24 @@ gt-y += \
gt/intel_breadcrumbs.o \
gt/intel_context.o \
gt/intel_engine_cs.o \
gt/intel_engine_pool.o \
gt/intel_engine_heartbeat.o \
gt/intel_engine_pm.o \
gt/intel_engine_pool.o \
gt/intel_engine_user.o \
gt/intel_gt.o \
gt/intel_gt_irq.o \
gt/intel_gt_pm.o \
gt/intel_gt_pm_irq.o \
gt/intel_gt_requests.o \
gt/intel_hangcheck.o \
gt/intel_llc.o \
gt/intel_lrc.o \
gt/intel_mocs.o \
gt/intel_rc6.o \
gt/intel_renderstate.o \
gt/intel_reset.o \
gt/intel_ringbuffer.o \
gt/intel_mocs.o \
gt/intel_ring.o \
gt/intel_ring_submission.o \
gt/intel_rps.o \
gt/intel_sseu.o \
gt/intel_timeline.o \
gt/intel_workarounds.o
@ -119,6 +121,7 @@ gem-y += \
gem/i915_gem_internal.o \
gem/i915_gem_object.o \
gem/i915_gem_object_blt.o \
gem/i915_gem_lmem.o \
gem/i915_gem_mman.o \
gem/i915_gem_pages.o \
gem/i915_gem_phys.o \
@ -147,6 +150,7 @@ i915-y += \
i915_scheduler.o \
i915_trace_points.o \
i915_vma.o \
intel_region_lmem.o \
intel_wopcm.o
# general-purpose microcontroller (GuC) support
@ -243,7 +247,8 @@ i915-y += \
oa/i915_oa_cflgt2.o \
oa/i915_oa_cflgt3.o \
oa/i915_oa_cnl.o \
oa/i915_oa_icl.o
oa/i915_oa_icl.o \
oa/i915_oa_tgl.o
i915-y += i915_perf.o
# Post-mortem debug and GPU hang state capture

View File

@ -1584,7 +1584,7 @@ void icl_dsi_init(struct drm_i915_private *dev_priv)
encoder->get_hw_state = gen11_dsi_get_hw_state;
encoder->type = INTEL_OUTPUT_DSI;
encoder->cloneable = 0;
encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C);
encoder->pipe_mask = ~0;
encoder->power_domain = POWER_DOMAIN_PORT_DSI;
encoder->get_power_domains = gen11_dsi_get_power_domains;

View File

@ -429,6 +429,13 @@ void intel_atomic_state_clear(struct drm_atomic_state *s)
struct intel_atomic_state *state = to_intel_atomic_state(s);
drm_atomic_state_default_clear(&state->base);
state->dpll_set = state->modeset = false;
state->global_state_changed = false;
state->active_pipes = 0;
memset(&state->min_cdclk, 0, sizeof(state->min_cdclk));
memset(&state->min_voltage_level, 0, sizeof(state->min_voltage_level));
memset(&state->cdclk.logical, 0, sizeof(state->cdclk.logical));
memset(&state->cdclk.actual, 0, sizeof(state->cdclk.actual));
state->cdclk.pipe = INVALID_PIPE;
}
struct intel_crtc_state *
@ -442,3 +449,40 @@ intel_atomic_get_crtc_state(struct drm_atomic_state *state,
return to_intel_crtc_state(crtc_state);
}
int intel_atomic_lock_global_state(struct intel_atomic_state *state)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_crtc *crtc;
state->global_state_changed = true;
for_each_intel_crtc(&dev_priv->drm, crtc) {
int ret;
ret = drm_modeset_lock(&crtc->base.mutex,
state->base.acquire_ctx);
if (ret)
return ret;
}
return 0;
}
int intel_atomic_serialize_global_state(struct intel_atomic_state *state)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_crtc *crtc;
state->global_state_changed = true;
for_each_intel_crtc(&dev_priv->drm, crtc) {
struct intel_crtc_state *crtc_state;
crtc_state = intel_atomic_get_crtc_state(&state->base, crtc);
if (IS_ERR(crtc_state))
return PTR_ERR(crtc_state);
}
return 0;
}

View File

@ -16,6 +16,7 @@ struct drm_crtc_state;
struct drm_device;
struct drm_i915_private;
struct drm_property;
struct intel_atomic_state;
struct intel_crtc;
struct intel_crtc_state;
@ -46,4 +47,8 @@ int intel_atomic_setup_scalers(struct drm_i915_private *dev_priv,
struct intel_crtc *intel_crtc,
struct intel_crtc_state *crtc_state);
int intel_atomic_lock_global_state(struct intel_atomic_state *state);
int intel_atomic_serialize_global_state(struct intel_atomic_state *state);
#endif /* __INTEL_ATOMIC_H__ */

View File

@ -138,6 +138,44 @@ unsigned int intel_plane_data_rate(const struct intel_crtc_state *crtc_state,
return cpp * crtc_state->pixel_rate;
}
bool intel_plane_calc_min_cdclk(struct intel_atomic_state *state,
struct intel_plane *plane)
{
struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
const struct intel_plane_state *plane_state =
intel_atomic_get_new_plane_state(state, plane);
struct intel_crtc *crtc = to_intel_crtc(plane_state->base.crtc);
struct intel_crtc_state *crtc_state;
if (!plane_state->base.visible || !plane->min_cdclk)
return false;
crtc_state = intel_atomic_get_new_crtc_state(state, crtc);
crtc_state->min_cdclk[plane->id] =
plane->min_cdclk(crtc_state, plane_state);
/*
* Does the cdclk need to be bumbed up?
*
* Note: we obviously need to be called before the new
* cdclk frequency is calculated so state->cdclk.logical
* hasn't been populated yet. Hence we look at the old
* cdclk state under dev_priv->cdclk.logical. This is
* safe as long we hold at least one crtc mutex (which
* must be true since we have crtc_state).
*/
if (crtc_state->min_cdclk[plane->id] > dev_priv->cdclk.logical.cdclk) {
DRM_DEBUG_KMS("[PLANE:%d:%s] min_cdclk (%d kHz) > logical cdclk (%d kHz)\n",
plane->base.base.id, plane->base.name,
crtc_state->min_cdclk[plane->id],
dev_priv->cdclk.logical.cdclk);
return true;
}
return false;
}
int intel_plane_atomic_check_with_state(const struct intel_crtc_state *old_crtc_state,
struct intel_crtc_state *new_crtc_state,
const struct intel_plane_state *old_plane_state,
@ -151,6 +189,7 @@ int intel_plane_atomic_check_with_state(const struct intel_crtc_state *old_crtc_
new_crtc_state->nv12_planes &= ~BIT(plane->id);
new_crtc_state->c8_planes &= ~BIT(plane->id);
new_crtc_state->data_rate[plane->id] = 0;
new_crtc_state->min_cdclk[plane->id] = 0;
new_plane_state->base.visible = false;
if (!new_plane_state->base.crtc && !old_plane_state->base.crtc)

View File

@ -47,5 +47,7 @@ int intel_plane_atomic_calc_changes(const struct intel_crtc_state *old_crtc_stat
struct intel_crtc_state *crtc_state,
const struct intel_plane_state *old_plane_state,
struct intel_plane_state *plane_state);
bool intel_plane_calc_min_cdclk(struct intel_atomic_state *state,
struct intel_plane *plane);
#endif /* __INTEL_ATOMIC_PLANE_H__ */

View File

@ -28,6 +28,7 @@
#include <drm/i915_component.h>
#include "i915_drv.h"
#include "intel_atomic.h"
#include "intel_audio.h"
#include "intel_display_types.h"
#include "intel_lpe_audio.h"
@ -818,13 +819,8 @@ retry:
to_intel_atomic_state(state)->cdclk.force_min_cdclk =
enable ? 2 * 96000 : 0;
/*
* Protects dev_priv->cdclk.force_min_cdclk
* Need to lock this here in case we have no active pipes
* and thus wouldn't lock it during the commit otherwise.
*/
ret = drm_modeset_lock(&dev_priv->drm.mode_config.connection_mutex,
&ctx);
/* Protects dev_priv->cdclk.force_min_cdclk */
ret = intel_atomic_lock_global_state(to_intel_atomic_state(state));
if (!ret)
ret = drm_atomic_commit(state);

View File

@ -1918,6 +1918,19 @@ static int intel_pixel_rate_to_cdclk(const struct intel_crtc_state *crtc_state)
return DIV_ROUND_UP(pixel_rate * 100, 90);
}
static int intel_planes_min_cdclk(const struct intel_crtc_state *crtc_state)
{
struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
struct intel_plane *plane;
int min_cdclk = 0;
for_each_intel_plane_on_crtc(&dev_priv->drm, crtc, plane)
min_cdclk = max(crtc_state->min_cdclk[plane->id], min_cdclk);
return min_cdclk;
}
int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
{
struct drm_i915_private *dev_priv =
@ -1986,6 +1999,9 @@ int intel_crtc_compute_min_cdclk(const struct intel_crtc_state *crtc_state)
IS_GEMINILAKE(dev_priv))
min_cdclk = max(158400, min_cdclk);
/* Account for additional needs from the planes */
min_cdclk = max(intel_planes_min_cdclk(crtc_state), min_cdclk);
if (min_cdclk > dev_priv->max_cdclk_freq) {
DRM_DEBUG_KMS("required cdclk (%d kHz) exceeds max (%d kHz)\n",
min_cdclk, dev_priv->max_cdclk_freq);
@ -2007,11 +2023,20 @@ static int intel_compute_min_cdclk(struct intel_atomic_state *state)
sizeof(state->min_cdclk));
for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
int ret;
min_cdclk = intel_crtc_compute_min_cdclk(crtc_state);
if (min_cdclk < 0)
return min_cdclk;
if (state->min_cdclk[i] == min_cdclk)
continue;
state->min_cdclk[i] = min_cdclk;
ret = intel_atomic_lock_global_state(state);
if (ret)
return ret;
}
min_cdclk = state->cdclk.force_min_cdclk;
@ -2034,7 +2059,7 @@ static int intel_compute_min_cdclk(struct intel_atomic_state *state)
* future platforms this code will need to be
* adjusted.
*/
static u8 bxt_compute_min_voltage_level(struct intel_atomic_state *state)
static int bxt_compute_min_voltage_level(struct intel_atomic_state *state)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_crtc *crtc;
@ -2047,11 +2072,21 @@ static u8 bxt_compute_min_voltage_level(struct intel_atomic_state *state)
sizeof(state->min_voltage_level));
for_each_new_intel_crtc_in_state(state, crtc, crtc_state, i) {
int ret;
if (crtc_state->base.enable)
state->min_voltage_level[i] =
crtc_state->min_voltage_level;
min_voltage_level = crtc_state->min_voltage_level;
else
state->min_voltage_level[i] = 0;
min_voltage_level = 0;
if (state->min_voltage_level[i] == min_voltage_level)
continue;
state->min_voltage_level[i] = min_voltage_level;
ret = intel_atomic_lock_global_state(state);
if (ret)
return ret;
}
min_voltage_level = 0;
@ -2195,20 +2230,24 @@ static int skl_modeset_calc_cdclk(struct intel_atomic_state *state)
static int bxt_modeset_calc_cdclk(struct intel_atomic_state *state)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
int min_cdclk, cdclk, vco;
int min_cdclk, min_voltage_level, cdclk, vco;
min_cdclk = intel_compute_min_cdclk(state);
if (min_cdclk < 0)
return min_cdclk;
min_voltage_level = bxt_compute_min_voltage_level(state);
if (min_voltage_level < 0)
return min_voltage_level;
cdclk = bxt_calc_cdclk(dev_priv, min_cdclk);
vco = bxt_calc_cdclk_pll_vco(dev_priv, cdclk);
state->cdclk.logical.vco = vco;
state->cdclk.logical.cdclk = cdclk;
state->cdclk.logical.voltage_level =
max(dev_priv->display.calc_voltage_level(cdclk),
bxt_compute_min_voltage_level(state));
max_t(int, min_voltage_level,
dev_priv->display.calc_voltage_level(cdclk));
if (!state->active_pipes) {
cdclk = bxt_calc_cdclk(dev_priv, state->cdclk.force_min_cdclk);
@ -2225,23 +2264,6 @@ static int bxt_modeset_calc_cdclk(struct intel_atomic_state *state)
return 0;
}
static int intel_lock_all_pipes(struct intel_atomic_state *state)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_crtc *crtc;
/* Add all pipes to the state */
for_each_intel_crtc(&dev_priv->drm, crtc) {
struct intel_crtc_state *crtc_state;
crtc_state = intel_atomic_get_crtc_state(&state->base, crtc);
if (IS_ERR(crtc_state))
return PTR_ERR(crtc_state);
}
return 0;
}
static int intel_modeset_all_pipes(struct intel_atomic_state *state)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
@ -2308,48 +2330,63 @@ int intel_modeset_calc_cdclk(struct intel_atomic_state *state)
return ret;
/*
* Writes to dev_priv->cdclk.logical must protected by
* holding all the crtc locks, even if we don't end up
* Writes to dev_priv->cdclk.{actual,logical} must protected
* by holding all the crtc mutexes even if we don't end up
* touching the hardware
*/
if (intel_cdclk_changed(&dev_priv->cdclk.logical,
&state->cdclk.logical)) {
ret = intel_lock_all_pipes(state);
if (ret < 0)
if (intel_cdclk_changed(&dev_priv->cdclk.actual,
&state->cdclk.actual)) {
/*
* Also serialize commits across all crtcs
* if the actual hw needs to be poked.
*/
ret = intel_atomic_serialize_global_state(state);
if (ret)
return ret;
} else if (intel_cdclk_changed(&dev_priv->cdclk.logical,
&state->cdclk.logical)) {
ret = intel_atomic_lock_global_state(state);
if (ret)
return ret;
} else {
return 0;
}
if (is_power_of_2(state->active_pipes)) {
if (is_power_of_2(state->active_pipes) &&
intel_cdclk_needs_cd2x_update(dev_priv,
&dev_priv->cdclk.actual,
&state->cdclk.actual)) {
struct intel_crtc *crtc;
struct intel_crtc_state *crtc_state;
pipe = ilog2(state->active_pipes);
crtc = intel_get_crtc_for_pipe(dev_priv, pipe);
crtc_state = intel_atomic_get_new_crtc_state(state, crtc);
if (crtc_state &&
drm_atomic_crtc_needs_modeset(&crtc_state->base))
crtc_state = intel_atomic_get_crtc_state(&state->base, crtc);
if (IS_ERR(crtc_state))
return PTR_ERR(crtc_state);
if (drm_atomic_crtc_needs_modeset(&crtc_state->base))
pipe = INVALID_PIPE;
} else {
pipe = INVALID_PIPE;
}
/* All pipes must be switched off while we change the cdclk. */
if (pipe != INVALID_PIPE &&
intel_cdclk_needs_cd2x_update(dev_priv,
&dev_priv->cdclk.actual,
&state->cdclk.actual)) {
ret = intel_lock_all_pipes(state);
if (ret)
return ret;
if (pipe != INVALID_PIPE) {
state->cdclk.pipe = pipe;
DRM_DEBUG_KMS("Can change cdclk with pipe %c active\n",
pipe_name(pipe));
} else if (intel_cdclk_needs_modeset(&dev_priv->cdclk.actual,
&state->cdclk.actual)) {
/* All pipes must be switched off while we change the cdclk. */
ret = intel_modeset_all_pipes(state);
if (ret)
return ret;
state->cdclk.pipe = INVALID_PIPE;
DRM_DEBUG_KMS("Modeset required for cdclk change\n");
}
DRM_DEBUG_KMS("New cdclk calculated to be logical %u kHz, actual %u kHz\n",

View File

@ -844,7 +844,7 @@ load_detect:
}
/* for pre-945g platforms use load detect */
ret = intel_get_load_detect_pipe(connector, NULL, &tmp, ctx);
ret = intel_get_load_detect_pipe(connector, &tmp, ctx);
if (ret > 0) {
if (intel_crt_detect_ddc(connector))
status = connector_status_connected;
@ -864,6 +864,13 @@ load_detect:
out:
intel_display_power_put(dev_priv, intel_encoder->power_domain, wakeref);
/*
* Make sure the refs for power wells enabled during detect are
* dropped to avoid a new detect cycle triggered by HPD polling.
*/
intel_display_power_flush_work(dev_priv);
return status;
}
@ -994,9 +1001,9 @@ void intel_crt_init(struct drm_i915_private *dev_priv)
crt->base.type = INTEL_OUTPUT_ANALOG;
crt->base.cloneable = (1 << INTEL_OUTPUT_DVO) | (1 << INTEL_OUTPUT_HDMI);
if (IS_I830(dev_priv))
crt->base.crtc_mask = BIT(PIPE_A);
crt->base.pipe_mask = BIT(PIPE_A);
else
crt->base.crtc_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C);
crt->base.pipe_mask = ~0;
if (IS_GEN(dev_priv, 2))
connector->interlace_allowed = 0;

View File

@ -1905,6 +1905,9 @@ intel_ddi_transcoder_func_reg_val_get(const struct intel_crtc_state *crtc_state)
} else if (intel_crtc_has_type(crtc_state, INTEL_OUTPUT_DP_MST)) {
temp |= TRANS_DDI_MODE_SELECT_DP_MST;
temp |= DDI_PORT_WIDTH(crtc_state->lane_count);
if (INTEL_GEN(dev_priv) >= 12)
temp |= TRANS_DDI_MST_TRANSPORT_SELECT(crtc_state->cpu_transcoder);
} else {
temp |= TRANS_DDI_MODE_SELECT_DP_SST;
temp |= DDI_PORT_WIDTH(crtc_state->lane_count);
@ -2234,7 +2237,7 @@ static void intel_ddi_get_power_domains(struct intel_encoder *encoder,
/*
* VDSC power is needed when DSC is enabled
*/
if (crtc_state->dsc_params.compression_enable)
if (crtc_state->dsc.compression_enable)
intel_display_power_get(dev_priv,
intel_dsc_power_domain(crtc_state));
}
@ -2838,6 +2841,8 @@ tgl_dkl_phy_ddi_vswing_sequence(struct intel_encoder *encoder, int link_clock,
for (ln = 0; ln < 2; ln++) {
I915_WRITE(HIP_INDEX_REG(tc_port), HIP_INDEX_VAL(tc_port, ln));
I915_WRITE(DKL_TX_PMD_LANE_SUS(tc_port), 0);
/* All the registers are RMW */
val = I915_READ(DKL_TX_DPCNTL0(tc_port));
val &= ~dpcnt_mask;
@ -3870,12 +3875,12 @@ static i915_reg_t
gen9_chicken_trans_reg_by_port(struct drm_i915_private *dev_priv,
enum port port)
{
static const i915_reg_t regs[] = {
[PORT_A] = CHICKEN_TRANS_EDP,
[PORT_B] = CHICKEN_TRANS_A,
[PORT_C] = CHICKEN_TRANS_B,
[PORT_D] = CHICKEN_TRANS_C,
[PORT_E] = CHICKEN_TRANS_A,
static const enum transcoder trans[] = {
[PORT_A] = TRANSCODER_EDP,
[PORT_B] = TRANSCODER_A,
[PORT_C] = TRANSCODER_B,
[PORT_D] = TRANSCODER_C,
[PORT_E] = TRANSCODER_A,
};
WARN_ON(INTEL_GEN(dev_priv) < 9);
@ -3883,7 +3888,7 @@ gen9_chicken_trans_reg_by_port(struct drm_i915_private *dev_priv,
if (WARN_ON(port < PORT_A || port > PORT_E))
port = PORT_A;
return regs[port];
return CHICKEN_TRANS(trans[port]);
}
static void intel_enable_ddi_hdmi(struct intel_encoder *encoder,
@ -4683,7 +4688,6 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
struct intel_encoder *intel_encoder;
struct drm_encoder *encoder;
bool init_hdmi, init_dp, init_lspcon = false;
enum pipe pipe;
enum phy phy = intel_port_to_phy(dev_priv, port);
init_hdmi = port_info->supports_dvi || port_info->supports_hdmi;
@ -4735,8 +4739,7 @@ void intel_ddi_init(struct drm_i915_private *dev_priv, enum port port)
intel_encoder->power_domain = intel_port_to_power_domain(port);
intel_encoder->port = port;
intel_encoder->cloneable = 0;
for_each_pipe(dev_priv, pipe)
intel_encoder->crtc_mask |= BIT(pipe);
intel_encoder->pipe_mask = ~0;
if (INTEL_GEN(dev_priv) >= 11)
intel_dig_port->saved_port_bits = I915_READ(DDI_BUF_CTL(port)) &

View File

@ -55,6 +55,8 @@
#include "display/intel_tv.h"
#include "display/intel_vdsc.h"
#include "gt/intel_rps.h"
#include "i915_drv.h"
#include "i915_trace.h"
#include "intel_acpi.h"
@ -88,7 +90,17 @@ static const u32 i8xx_primary_formats[] = {
DRM_FORMAT_XRGB8888,
};
/* Primary plane formats for gen >= 4 */
/* Primary plane formats for ivb (no fp16 due to hw issue) */
static const u32 ivb_primary_formats[] = {
DRM_FORMAT_C8,
DRM_FORMAT_RGB565,
DRM_FORMAT_XRGB8888,
DRM_FORMAT_XBGR8888,
DRM_FORMAT_XRGB2101010,
DRM_FORMAT_XBGR2101010,
};
/* Primary plane formats for gen >= 4, except ivb */
static const u32 i965_primary_formats[] = {
DRM_FORMAT_C8,
DRM_FORMAT_RGB565,
@ -96,6 +108,7 @@ static const u32 i965_primary_formats[] = {
DRM_FORMAT_XBGR8888,
DRM_FORMAT_XRGB2101010,
DRM_FORMAT_XBGR2101010,
DRM_FORMAT_XBGR16161616F,
};
static const u64 i9xx_format_modifiers[] = {
@ -2971,6 +2984,8 @@ static int i9xx_format_to_fourcc(int format)
return DRM_FORMAT_XRGB2101010;
case DISPPLANE_RGBX101010:
return DRM_FORMAT_XBGR2101010;
case DISPPLANE_RGBX161616:
return DRM_FORMAT_XBGR16161616F;
}
}
@ -3154,6 +3169,7 @@ static void intel_plane_disable_noatomic(struct intel_crtc *crtc,
intel_set_plane_visible(crtc_state, plane_state, false);
fixup_active_planes(crtc_state);
crtc_state->data_rate[plane->id] = 0;
crtc_state->min_cdclk[plane->id] = 0;
if (plane->id == PLANE_PRIMARY)
intel_pre_disable_primary_noatomic(&crtc->base);
@ -3577,6 +3593,53 @@ int skl_check_plane_surface(struct intel_plane_state *plane_state)
return 0;
}
static void i9xx_plane_ratio(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state,
unsigned int *num, unsigned int *den)
{
const struct drm_framebuffer *fb = plane_state->base.fb;
unsigned int cpp = fb->format->cpp[0];
/*
* g4x bspec says 64bpp pixel rate can't exceed 80%
* of cdclk when the sprite plane is enabled on the
* same pipe. ilk/snb bspec says 64bpp pixel rate is
* never allowed to exceed 80% of cdclk. Let's just go
* with the ilk/snb limit always.
*/
if (cpp == 8) {
*num = 10;
*den = 8;
} else {
*num = 1;
*den = 1;
}
}
static int i9xx_plane_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
unsigned int pixel_rate;
unsigned int num, den;
/*
* Note that crtc_state->pixel_rate accounts for both
* horizontal and vertical panel fitter downscaling factors.
* Pre-HSW bspec tells us to only consider the horizontal
* downscaling factor here. We ignore that and just consider
* both for simplicity.
*/
pixel_rate = crtc_state->pixel_rate;
i9xx_plane_ratio(crtc_state, plane_state, &num, &den);
/* two pixels per clock with double wide pipe */
if (crtc_state->double_wide)
den *= 2;
return DIV_ROUND_UP(pixel_rate * num, den);
}
unsigned int
i9xx_plane_max_stride(struct intel_plane *plane,
u32 pixel_format, u64 modifier,
@ -3659,6 +3722,9 @@ static u32 i9xx_plane_ctl(const struct intel_crtc_state *crtc_state,
case DRM_FORMAT_XBGR2101010:
dspcntr |= DISPPLANE_RGBX101010;
break;
case DRM_FORMAT_XBGR16161616F:
dspcntr |= DISPPLANE_RGBX161616;
break;
default:
MISSING_CASE(fb->format->format);
return 0;
@ -3681,7 +3747,8 @@ int i9xx_check_plane_surface(struct intel_plane_state *plane_state)
{
struct drm_i915_private *dev_priv =
to_i915(plane_state->base.plane->dev);
int src_x, src_y;
const struct drm_framebuffer *fb = plane_state->base.fb;
int src_x, src_y, src_w;
u32 offset;
int ret;
@ -3692,9 +3759,14 @@ int i9xx_check_plane_surface(struct intel_plane_state *plane_state)
if (!plane_state->base.visible)
return 0;
src_w = drm_rect_width(&plane_state->base.src) >> 16;
src_x = plane_state->base.src.x1 >> 16;
src_y = plane_state->base.src.y1 >> 16;
/* Undocumented hardware limit on i965/g4x/vlv/chv */
if (HAS_GMCH(dev_priv) && fb->format->cpp[0] == 8 && src_w > 2048)
return -EINVAL;
intel_add_fb_offsets(&src_x, &src_y, plane_state, 0);
if (INTEL_GEN(dev_priv) >= 4)
@ -5592,10 +5664,6 @@ static int skl_update_scaler_plane(struct intel_crtc_state *crtc_state,
case DRM_FORMAT_ARGB8888:
case DRM_FORMAT_XRGB2101010:
case DRM_FORMAT_XBGR2101010:
case DRM_FORMAT_XBGR16161616F:
case DRM_FORMAT_ABGR16161616F:
case DRM_FORMAT_XRGB16161616F:
case DRM_FORMAT_ARGB16161616F:
case DRM_FORMAT_YUYV:
case DRM_FORMAT_YVYU:
case DRM_FORMAT_UYVY:
@ -5611,6 +5679,13 @@ static int skl_update_scaler_plane(struct intel_crtc_state *crtc_state,
case DRM_FORMAT_XVYU12_16161616:
case DRM_FORMAT_XVYU16161616:
break;
case DRM_FORMAT_XBGR16161616F:
case DRM_FORMAT_ABGR16161616F:
case DRM_FORMAT_XRGB16161616F:
case DRM_FORMAT_ARGB16161616F:
if (INTEL_GEN(dev_priv) >= 11)
break;
/* fall through */
default:
DRM_DEBUG_KMS("[PLANE:%d:%s] FB:%d unsupported scaling format 0x%x\n",
intel_plane->base.base.id, intel_plane->base.name,
@ -9359,7 +9434,6 @@ static bool wrpll_uses_pch_ssc(struct drm_i915_private *dev_priv,
static void lpt_init_pch_refclk(struct drm_i915_private *dev_priv)
{
struct intel_encoder *encoder;
bool pch_ssc_in_use = false;
bool has_fdi = false;
for_each_intel_encoder(&dev_priv->drm, encoder) {
@ -9387,22 +9461,24 @@ static void lpt_init_pch_refclk(struct drm_i915_private *dev_priv)
* clock hierarchy. That would also allow us to do
* clock bending finally.
*/
dev_priv->pch_ssc_use = 0;
if (spll_uses_pch_ssc(dev_priv)) {
DRM_DEBUG_KMS("SPLL using PCH SSC\n");
pch_ssc_in_use = true;
dev_priv->pch_ssc_use |= BIT(DPLL_ID_SPLL);
}
if (wrpll_uses_pch_ssc(dev_priv, DPLL_ID_WRPLL1)) {
DRM_DEBUG_KMS("WRPLL1 using PCH SSC\n");
pch_ssc_in_use = true;
dev_priv->pch_ssc_use |= BIT(DPLL_ID_WRPLL1);
}
if (wrpll_uses_pch_ssc(dev_priv, DPLL_ID_WRPLL2)) {
DRM_DEBUG_KMS("WRPLL2 using PCH SSC\n");
pch_ssc_in_use = true;
dev_priv->pch_ssc_use |= BIT(DPLL_ID_WRPLL2);
}
if (pch_ssc_in_use)
if (dev_priv->pch_ssc_use)
return;
if (has_fdi) {
@ -10871,7 +10947,7 @@ static void i845_update_cursor(struct intel_plane *plane,
unsigned long irqflags;
if (plane_state && plane_state->base.visible) {
unsigned int width = drm_rect_width(&plane_state->base.src);
unsigned int width = drm_rect_width(&plane_state->base.dst);
unsigned int height = drm_rect_height(&plane_state->base.dst);
cntl = plane_state->ctl |
@ -11252,7 +11328,6 @@ static int intel_modeset_disable_planes(struct drm_atomic_state *state,
}
int intel_get_load_detect_pipe(struct drm_connector *connector,
const struct drm_display_mode *mode,
struct intel_load_detect_pipe *old,
struct drm_modeset_acquire_ctx *ctx)
{
@ -11359,10 +11434,8 @@ found:
crtc_state->base.active = crtc_state->base.enable = true;
if (!mode)
mode = &load_detect_mode;
ret = drm_atomic_set_mode_for_crtc(&crtc_state->base, mode);
ret = drm_atomic_set_mode_for_crtc(&crtc_state->base,
&load_detect_mode);
if (ret)
goto fail;
@ -11706,6 +11779,7 @@ int intel_plane_atomic_calc_changes(const struct intel_crtc_state *old_crtc_stat
plane_state->base.visible = visible = false;
crtc_state->active_planes &= ~BIT(plane->id);
crtc_state->data_rate[plane->id] = 0;
crtc_state->min_cdclk[plane->id] = 0;
}
if (!was_visible && !visible)
@ -12072,11 +12146,6 @@ static int intel_crtc_atomic_check(struct intel_atomic_state *state,
if (INTEL_GEN(dev_priv) >= 9) {
if (mode_changed || crtc_state->update_pipe)
ret = skl_update_scaler_crtc(crtc_state);
if (!ret)
ret = icl_check_nv12_planes(crtc_state);
if (!ret)
ret = skl_check_pipe_max_pixel_rate(crtc, crtc_state);
if (!ret)
ret = intel_atomic_setup_scalers(dev_priv, crtc,
crtc_state);
@ -12425,6 +12494,12 @@ static bool check_digital_port_conflicts(struct intel_atomic_state *state)
unsigned int used_mst_ports = 0;
bool ret = true;
/*
* We're going to peek into connector->state,
* hence connection_mutex must be held.
*/
drm_modeset_lock_assert_held(&dev->mode_config.connection_mutex);
/*
* Walk the connector list instead of the encoder
* list to detect the problem on ddi platforms
@ -13712,11 +13787,6 @@ static int intel_modeset_checks(struct intel_atomic_state *state)
struct intel_crtc *crtc;
int ret, i;
if (!check_digital_port_conflicts(state)) {
DRM_DEBUG_KMS("rejecting conflicting digital port configuration\n");
return -EINVAL;
}
/* keep the current setting */
if (!state->cdclk.force_min_cdclk_changed)
state->cdclk.force_min_cdclk = dev_priv->cdclk.force_min_cdclk;
@ -13725,7 +13795,6 @@ static int intel_modeset_checks(struct intel_atomic_state *state)
state->active_pipes = dev_priv->active_pipes;
state->cdclk.logical = dev_priv->cdclk.logical;
state->cdclk.actual = dev_priv->cdclk.actual;
state->cdclk.pipe = INVALID_PIPE;
for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
new_crtc_state, i) {
@ -13738,6 +13807,12 @@ static int intel_modeset_checks(struct intel_atomic_state *state)
state->active_pipe_changes |= BIT(crtc->pipe);
}
if (state->active_pipe_changes) {
ret = intel_atomic_lock_global_state(state);
if (ret)
return ret;
}
ret = intel_modeset_calc_cdclk(state);
if (ret)
return ret;
@ -13790,12 +13865,49 @@ static void intel_crtc_check_fastset(const struct intel_crtc_state *old_crtc_sta
new_crtc_state->has_drrs = old_crtc_state->has_drrs;
}
static int intel_atomic_check_planes(struct intel_atomic_state *state)
static int intel_crtc_add_planes_to_state(struct intel_atomic_state *state,
struct intel_crtc *crtc,
u8 plane_ids_mask)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_plane *plane;
for_each_intel_plane_on_crtc(&dev_priv->drm, crtc, plane) {
struct intel_plane_state *plane_state;
if ((plane_ids_mask & BIT(plane->id)) == 0)
continue;
plane_state = intel_atomic_get_plane_state(state, plane);
if (IS_ERR(plane_state))
return PTR_ERR(plane_state);
}
return 0;
}
static bool active_planes_affects_min_cdclk(struct drm_i915_private *dev_priv)
{
/* See {hsw,vlv,ivb}_plane_ratio() */
return IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv) ||
IS_CHERRYVIEW(dev_priv) || IS_VALLEYVIEW(dev_priv) ||
IS_IVYBRIDGE(dev_priv);
}
static int intel_atomic_check_planes(struct intel_atomic_state *state,
bool *need_modeset)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_crtc_state *old_crtc_state, *new_crtc_state;
struct intel_plane_state *plane_state;
struct intel_plane *plane;
struct intel_crtc *crtc;
int i, ret;
ret = icl_add_linked_planes(state);
if (ret)
return ret;
for_each_new_intel_plane_in_state(state, plane, plane_state, i) {
ret = intel_plane_atomic_check(state, plane);
if (ret) {
@ -13805,6 +13917,41 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state)
}
}
for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
new_crtc_state, i) {
u8 old_active_planes, new_active_planes;
ret = icl_check_nv12_planes(new_crtc_state);
if (ret)
return ret;
/*
* On some platforms the number of active planes affects
* the planes' minimum cdclk calculation. Add such planes
* to the state before we compute the minimum cdclk.
*/
if (!active_planes_affects_min_cdclk(dev_priv))
continue;
old_active_planes = old_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
new_active_planes = new_crtc_state->active_planes & ~BIT(PLANE_CURSOR);
if (hweight8(old_active_planes) == hweight8(new_active_planes))
continue;
ret = intel_crtc_add_planes_to_state(state, crtc, new_active_planes);
if (ret)
return ret;
}
/*
* active_planes bitmask has been updated, and potentially
* affected planes are part of the state. We can now
* compute the minimum cdclk for each plane.
*/
for_each_new_intel_plane_in_state(state, plane, plane_state, i)
*need_modeset |= intel_plane_calc_min_cdclk(state, plane);
return 0;
}
@ -13839,7 +13986,7 @@ static int intel_atomic_check(struct drm_device *dev,
struct intel_crtc_state *old_crtc_state, *new_crtc_state;
struct intel_crtc *crtc;
int ret, i;
bool any_ms = state->cdclk.force_min_cdclk_changed;
bool any_ms = false;
/* Catch I915_MODE_FLAG_INHERITED */
for_each_oldnew_intel_crtc_in_state(state, crtc, old_crtc_state,
@ -13873,10 +14020,22 @@ static int intel_atomic_check(struct drm_device *dev,
any_ms = true;
}
if (any_ms && !check_digital_port_conflicts(state)) {
DRM_DEBUG_KMS("rejecting conflicting digital port configuration\n");
ret = EINVAL;
goto fail;
}
ret = drm_dp_mst_atomic_check(&state->base);
if (ret)
goto fail;
any_ms |= state->cdclk.force_min_cdclk_changed;
ret = intel_atomic_check_planes(state, &any_ms);
if (ret)
goto fail;
if (any_ms) {
ret = intel_modeset_checks(state);
if (ret)
@ -13885,14 +14044,6 @@ static int intel_atomic_check(struct drm_device *dev,
state->cdclk.logical = dev_priv->cdclk.logical;
}
ret = icl_add_linked_planes(state);
if (ret)
goto fail;
ret = intel_atomic_check_planes(state);
if (ret)
goto fail;
ret = intel_atomic_check_crtcs(state);
if (ret)
goto fail;
@ -13973,9 +14124,6 @@ static void intel_pipe_fastset(const struct intel_crtc_state *old_crtc_state,
struct intel_crtc *crtc = to_intel_crtc(new_crtc_state->base.crtc);
struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
/* drm_atomic_helper_update_legacy_modeset_state might not be called. */
crtc->base.mode = new_crtc_state->base.mode;
/*
* Update pipe size and adjust fitter if needed: the reason for this is
* that in compute_mode_changes we check the native mode (not the pfit
@ -14237,8 +14385,8 @@ static void intel_crtc_enable_trans_port_sync(struct intel_crtc *crtc,
static void intel_set_dp_tp_ctl_normal(struct intel_crtc *crtc,
struct intel_atomic_state *state)
{
struct drm_connector *uninitialized_var(conn);
struct drm_connector_state *conn_state;
struct drm_connector *conn;
struct intel_dp *intel_dp;
int i;
@ -14670,6 +14818,14 @@ static void intel_atomic_track_fbs(struct intel_atomic_state *state)
plane->frontbuffer_bit);
}
static void assert_global_state_locked(struct drm_i915_private *dev_priv)
{
struct intel_crtc *crtc;
for_each_intel_crtc(&dev_priv->drm, crtc)
drm_modeset_lock_assert_held(&crtc->base.mutex);
}
static int intel_atomic_commit(struct drm_device *dev,
struct drm_atomic_state *_state,
bool nonblock)
@ -14735,7 +14891,9 @@ static int intel_atomic_commit(struct drm_device *dev,
intel_shared_dpll_swap_state(state);
intel_atomic_track_fbs(state);
if (state->modeset) {
if (state->global_state_changed) {
assert_global_state_locked(dev_priv);
memcpy(dev_priv->min_cdclk, state->min_cdclk,
sizeof(state->min_cdclk));
memcpy(dev_priv->min_voltage_level, state->min_voltage_level,
@ -14782,7 +14940,7 @@ static int do_rps_boost(struct wait_queue_entry *_wait,
* vblank without our intervention, so leave RPS alone.
*/
if (!i915_request_started(rq))
gen6_rps_boost(rq);
intel_rps_boost(rq);
i915_request_put(rq);
drm_crtc_vblank_put(wait->crtc);
@ -14863,7 +15021,7 @@ static void intel_plane_unpin_fb(struct intel_plane_state *old_plane_state)
static void fb_obj_bump_render_priority(struct drm_i915_gem_object *obj)
{
struct i915_sched_attr attr = {
.priority = I915_PRIORITY_DISPLAY,
.priority = I915_USER_PRIORITY(I915_PRIORITY_DISPLAY),
};
i915_gem_object_wait_priority(obj, 0, &attr);
@ -14976,7 +15134,7 @@ intel_prepare_plane_fb(struct drm_plane *plane,
* maximum clocks following a vblank miss (see do_rps_boost()).
*/
if (!intel_state->rps_interactive) {
intel_rps_mark_interactive(dev_priv, true);
intel_rps_mark_interactive(&dev_priv->gt.rps, true);
intel_state->rps_interactive = true;
}
@ -15001,7 +15159,7 @@ intel_cleanup_plane_fb(struct drm_plane *plane,
struct drm_i915_private *dev_priv = to_i915(plane->dev);
if (intel_state->rps_interactive) {
intel_rps_mark_interactive(dev_priv, false);
intel_rps_mark_interactive(&dev_priv->gt.rps, false);
intel_state->rps_interactive = false;
}
@ -15009,44 +15167,6 @@ intel_cleanup_plane_fb(struct drm_plane *plane,
intel_plane_unpin_fb(old_plane_state);
}
int
skl_max_scale(const struct intel_crtc_state *crtc_state,
const struct drm_format_info *format)
{
struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
int max_scale;
int crtc_clock, max_dotclk, tmpclk1, tmpclk2;
if (!crtc_state->base.enable)
return DRM_PLANE_HELPER_NO_SCALING;
crtc_clock = crtc_state->base.adjusted_mode.crtc_clock;
max_dotclk = to_intel_atomic_state(crtc_state->base.state)->cdclk.logical.cdclk;
if (IS_GEMINILAKE(dev_priv) || INTEL_GEN(dev_priv) >= 10)
max_dotclk *= 2;
if (WARN_ON_ONCE(!crtc_clock || max_dotclk < crtc_clock))
return DRM_PLANE_HELPER_NO_SCALING;
/*
* skl max scale is lower of:
* close to 3 but not 3, -1 is for that purpose
* or
* cdclk/crtc_clock
*/
if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv) ||
!drm_format_info_is_yuv_semiplanar(format))
tmpclk1 = 0x30000 - 1;
else
tmpclk1 = 0x20000 - 1;
tmpclk2 = (1 << 8) * ((max_dotclk << 8) / crtc_clock);
max_scale = min(tmpclk1, tmpclk2);
return max_scale;
}
/**
* intel_plane_destroy - destroy a plane
* @plane: plane to destroy
@ -15101,6 +15221,7 @@ static bool i965_plane_format_mod_supported(struct drm_plane *_plane,
case DRM_FORMAT_XBGR8888:
case DRM_FORMAT_XRGB2101010:
case DRM_FORMAT_XBGR2101010:
case DRM_FORMAT_XBGR16161616F:
return modifier == DRM_FORMAT_MOD_LINEAR ||
modifier == I915_FORMAT_MOD_X_TILED;
default:
@ -15321,8 +15442,26 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
}
if (INTEL_GEN(dev_priv) >= 4) {
formats = i965_primary_formats;
num_formats = ARRAY_SIZE(i965_primary_formats);
/*
* WaFP16GammaEnabling:ivb
* "Workaround : When using the 64-bit format, the plane
* output on each color channel has one quarter amplitude.
* It can be brought up to full amplitude by using pipe
* gamma correction or pipe color space conversion to
* multiply the plane output by four."
*
* There is no dedicated plane gamma for the primary plane,
* and using the pipe gamma/csc could conflict with other
* planes, so we choose not to expose fp16 on IVB primary
* planes. HSW primary planes no longer have this problem.
*/
if (IS_IVYBRIDGE(dev_priv)) {
formats = ivb_primary_formats;
num_formats = ARRAY_SIZE(ivb_primary_formats);
} else {
formats = i965_primary_formats;
num_formats = ARRAY_SIZE(i965_primary_formats);
}
modifiers = i9xx_format_modifiers;
plane->max_stride = i9xx_plane_max_stride;
@ -15331,6 +15470,15 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
plane->get_hw_state = i9xx_plane_get_hw_state;
plane->check_plane = i9xx_plane_check;
if (IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv))
plane->min_cdclk = hsw_plane_min_cdclk;
else if (IS_IVYBRIDGE(dev_priv))
plane->min_cdclk = ivb_plane_min_cdclk;
else if (IS_CHERRYVIEW(dev_priv) || IS_VALLEYVIEW(dev_priv))
plane->min_cdclk = vlv_plane_min_cdclk;
else
plane->min_cdclk = i9xx_plane_min_cdclk;
plane_funcs = &i965_plane_funcs;
} else {
formats = i8xx_primary_formats;
@ -15342,6 +15490,7 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
plane->disable_plane = i9xx_disable_plane;
plane->get_hw_state = i9xx_plane_get_hw_state;
plane->check_plane = i9xx_plane_check;
plane->min_cdclk = i9xx_plane_min_cdclk;
plane_funcs = &i8xx_plane_funcs;
}
@ -15693,7 +15842,7 @@ static u32 intel_encoder_possible_crtcs(struct intel_encoder *encoder)
u32 possible_crtcs = 0;
for_each_intel_crtc(dev, crtc) {
if (encoder->crtc_mask & BIT(crtc->pipe))
if (encoder->pipe_mask & BIT(crtc->pipe))
possible_crtcs |= drm_crtc_mask(&crtc->base);
}
@ -16294,6 +16443,21 @@ intel_mode_valid(struct drm_device *dev,
mode->vtotal > vtotal_max)
return MODE_V_ILLEGAL;
if (INTEL_GEN(dev_priv) >= 5) {
if (mode->hdisplay < 64 ||
mode->htotal - mode->hdisplay < 32)
return MODE_H_ILLEGAL;
if (mode->vtotal - mode->vdisplay < 5)
return MODE_V_ILLEGAL;
} else {
if (mode->htotal - mode->hdisplay < 32)
return MODE_H_ILLEGAL;
if (mode->vtotal - mode->vdisplay < 3)
return MODE_V_ILLEGAL;
}
return MODE_OK;
}
@ -17224,13 +17388,16 @@ static void intel_modeset_readout_hw_state(struct drm_device *dev)
struct intel_plane *plane;
int min_cdclk = 0;
memset(&crtc->base.mode, 0, sizeof(crtc->base.mode));
if (crtc_state->base.active) {
intel_mode_from_pipe_config(&crtc->base.mode, crtc_state);
crtc->base.mode.hdisplay = crtc_state->pipe_src_w;
crtc->base.mode.vdisplay = crtc_state->pipe_src_h;
intel_mode_from_pipe_config(&crtc_state->base.adjusted_mode, crtc_state);
WARN_ON(drm_atomic_set_mode_for_crtc(&crtc_state->base, &crtc->base.mode));
struct drm_display_mode mode;
intel_mode_from_pipe_config(&crtc_state->base.adjusted_mode,
crtc_state);
mode = crtc_state->base.adjusted_mode;
mode.hdisplay = crtc_state->pipe_src_w;
mode.vdisplay = crtc_state->pipe_src_h;
WARN_ON(drm_atomic_set_mode_for_crtc(&crtc_state->base, &mode));
/*
* The initial mode needs to be set in order to keep
@ -17245,17 +17412,9 @@ static void intel_modeset_readout_hw_state(struct drm_device *dev)
intel_crtc_compute_pixel_rate(crtc_state);
min_cdclk = intel_crtc_compute_min_cdclk(crtc_state);
if (WARN_ON(min_cdclk < 0))
min_cdclk = 0;
intel_crtc_update_active_timings(crtc_state);
}
dev_priv->min_cdclk[crtc->pipe] = min_cdclk;
dev_priv->min_voltage_level[crtc->pipe] =
crtc_state->min_voltage_level;
for_each_intel_plane_on_crtc(&dev_priv->drm, crtc, plane) {
const struct intel_plane_state *plane_state =
to_intel_plane_state(plane->base.state);
@ -17267,8 +17426,34 @@ static void intel_modeset_readout_hw_state(struct drm_device *dev)
if (plane_state->base.visible)
crtc_state->data_rate[plane->id] =
4 * crtc_state->pixel_rate;
/*
* FIXME don't have the fb yet, so can't
* use plane->min_cdclk() :(
*/
if (plane_state->base.visible && plane->min_cdclk) {
if (crtc_state->double_wide ||
INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
crtc_state->min_cdclk[plane->id] =
DIV_ROUND_UP(crtc_state->pixel_rate, 2);
else
crtc_state->min_cdclk[plane->id] =
crtc_state->pixel_rate;
}
DRM_DEBUG_KMS("[PLANE:%d:%s] min_cdclk %d kHz\n",
plane->base.base.id, plane->base.name,
crtc_state->min_cdclk[plane->id]);
}
if (crtc_state->base.active) {
min_cdclk = intel_crtc_compute_min_cdclk(crtc_state);
if (WARN_ON(min_cdclk < 0))
min_cdclk = 0;
}
dev_priv->min_cdclk[crtc->pipe] = min_cdclk;
dev_priv->min_voltage_level[crtc->pipe] =
crtc_state->min_voltage_level;
intel_bw_crtc_update(bw_state, crtc_state);
intel_pipe_config_sanity_check(dev_priv, crtc_state);

View File

@ -509,7 +509,6 @@ void vlv_wait_port_ready(struct drm_i915_private *dev_priv,
struct intel_digital_port *dport,
unsigned int expected_mask);
int intel_get_load_detect_pipe(struct drm_connector *connector,
const struct drm_display_mode *mode,
struct intel_load_detect_pipe *old,
struct drm_modeset_acquire_ctx *ctx);
void intel_release_load_detect_pipe(struct drm_connector *connector,
@ -563,8 +562,6 @@ void intel_crtc_arm_fifo_underrun(struct intel_crtc *crtc,
u16 skl_scaler_calc_phase(int sub, int scale, bool chroma_center);
int skl_update_scaler_crtc(struct intel_crtc_state *crtc_state);
int skl_max_scale(const struct intel_crtc_state *crtc_state,
const struct drm_format_info *format);
u32 glk_plane_color_ctl(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state);
u32 glk_plane_color_ctl_crtc(const struct intel_crtc_state *crtc_state);

View File

@ -2682,6 +2682,8 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
TGL_PW_2_POWER_DOMAINS | \
BIT_ULL(POWER_DOMAIN_MODESET) | \
BIT_ULL(POWER_DOMAIN_AUX_A) | \
BIT_ULL(POWER_DOMAIN_AUX_B) | \
BIT_ULL(POWER_DOMAIN_AUX_C) | \
BIT_ULL(POWER_DOMAIN_INIT))
#define TGL_DDI_IO_D_TC1_POWER_DOMAINS ( \

View File

@ -128,7 +128,8 @@ struct intel_encoder {
enum intel_output_type type;
enum port port;
unsigned int cloneable;
u16 cloneable;
u8 pipe_mask;
enum intel_hotplug_state (*hotplug)(struct intel_encoder *encoder,
struct intel_connector *connector,
bool irq_received);
@ -187,7 +188,6 @@ struct intel_encoder {
* device interrupts are disabled.
*/
void (*suspend)(struct intel_encoder *);
int crtc_mask;
enum hpd_pin hpd_pin;
enum intel_display_power_domain power_domain;
/* for communication with audio component; protected by av_mutex */
@ -506,6 +506,14 @@ struct intel_atomic_state {
bool rps_interactive;
/*
* active_pipes
* min_cdclk[]
* min_voltage_level[]
* cdclk.*
*/
bool global_state_changed;
/* Gen9+ only */
struct skl_ddb_values wm_results;
@ -932,6 +940,8 @@ struct intel_crtc_state {
struct intel_crtc_wm_state wm;
int min_cdclk[I915_MAX_PLANES];
u32 data_rate[I915_MAX_PLANES];
/* Gamma mode programmed on the pipe */
@ -986,8 +996,8 @@ struct intel_crtc_state {
bool dsc_split;
u16 compressed_bpp;
u8 slice_count;
} dsc_params;
struct drm_dsc_config dp_dsc_cfg;
struct drm_dsc_config config;
} dsc;
/* Forward Error correction State */
bool fec_enable;
@ -1077,6 +1087,8 @@ struct intel_plane {
bool (*get_hw_state)(struct intel_plane *plane, enum pipe *pipe);
int (*check_plane)(struct intel_crtc_state *crtc_state,
struct intel_plane_state *plane_state);
int (*min_cdclk)(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state);
};
struct intel_watermark_params {

View File

@ -1179,18 +1179,20 @@ intel_dp_aux_wait_done(struct intel_dp *intel_dp)
{
struct drm_i915_private *i915 = dp_to_i915(intel_dp);
i915_reg_t ch_ctl = intel_dp->aux_ch_ctl_reg(intel_dp);
const unsigned int timeout_ms = 10;
u32 status;
bool done;
#define C (((status = intel_uncore_read_notrace(&i915->uncore, ch_ctl)) & DP_AUX_CH_CTL_SEND_BUSY) == 0)
done = wait_event_timeout(i915->gmbus_wait_queue, C,
msecs_to_jiffies_timeout(10));
msecs_to_jiffies_timeout(timeout_ms));
/* just trace the final value */
trace_i915_reg_rw(false, ch_ctl, status, sizeof(status), true);
if (!done)
DRM_ERROR("dp aux hw did not signal timeout!\n");
DRM_ERROR("%s did not complete or timeout within %ums (status 0x%08x)\n",
intel_dp->aux.name, timeout_ms, status);
#undef C
return status;
@ -1291,6 +1293,9 @@ static u32 skl_get_aux_send_ctl(struct intel_dp *intel_dp,
u32 unused)
{
struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
struct drm_i915_private *i915 =
to_i915(intel_dig_port->base.base.dev);
enum phy phy = intel_port_to_phy(i915, intel_dig_port->base.port);
u32 ret;
ret = DP_AUX_CH_CTL_SEND_BUSY |
@ -1303,7 +1308,8 @@ static u32 skl_get_aux_send_ctl(struct intel_dp *intel_dp,
DP_AUX_CH_CTL_FW_SYNC_PULSE_SKL(32) |
DP_AUX_CH_CTL_SYNC_PULSE_SKL(32);
if (intel_dig_port->tc_mode == TC_PORT_TBT_ALT)
if (intel_phy_is_tc(i915, phy) &&
intel_dig_port->tc_mode == TC_PORT_TBT_ALT)
ret |= DP_AUX_CH_CTL_TBT_IO;
return ret;
@ -1888,6 +1894,9 @@ static bool intel_dp_source_supports_dsc(struct intel_dp *intel_dp,
{
struct drm_i915_private *dev_priv = dp_to_i915(intel_dp);
if (!INTEL_INFO(dev_priv)->display.has_dsc)
return false;
/* On TGL, DSC is supported on all Pipes */
if (INTEL_GEN(dev_priv) >= 12)
return true;
@ -2080,10 +2089,10 @@ static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
pipe_config->lane_count = limits->max_lane_count;
if (intel_dp_is_edp(intel_dp)) {
pipe_config->dsc_params.compressed_bpp =
pipe_config->dsc.compressed_bpp =
min_t(u16, drm_edp_dsc_sink_output_bpp(intel_dp->dsc_dpcd) >> 4,
pipe_config->pipe_bpp);
pipe_config->dsc_params.slice_count =
pipe_config->dsc.slice_count =
drm_dp_dsc_sink_max_slice_count(intel_dp->dsc_dpcd,
true);
} else {
@ -2104,10 +2113,10 @@ static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
DRM_DEBUG_KMS("Compressed BPP/Slice Count not supported\n");
return -EINVAL;
}
pipe_config->dsc_params.compressed_bpp = min_t(u16,
pipe_config->dsc.compressed_bpp = min_t(u16,
dsc_max_output_bpp >> 4,
pipe_config->pipe_bpp);
pipe_config->dsc_params.slice_count = dsc_dp_slice_count;
pipe_config->dsc.slice_count = dsc_dp_slice_count;
}
/*
* VDSC engine operates at 1 Pixel per clock, so if peak pixel rate
@ -2115,8 +2124,8 @@ static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
* then we need to use 2 VDSC instances.
*/
if (adjusted_mode->crtc_clock > dev_priv->max_cdclk_freq) {
if (pipe_config->dsc_params.slice_count > 1) {
pipe_config->dsc_params.dsc_split = true;
if (pipe_config->dsc.slice_count > 1) {
pipe_config->dsc.dsc_split = true;
} else {
DRM_DEBUG_KMS("Cannot split stream to use 2 VDSC instances\n");
return -EINVAL;
@ -2128,16 +2137,16 @@ static int intel_dp_dsc_compute_config(struct intel_dp *intel_dp,
DRM_DEBUG_KMS("Cannot compute valid DSC parameters for Input Bpp = %d "
"Compressed BPP = %d\n",
pipe_config->pipe_bpp,
pipe_config->dsc_params.compressed_bpp);
pipe_config->dsc.compressed_bpp);
return ret;
}
pipe_config->dsc_params.compression_enable = true;
pipe_config->dsc.compression_enable = true;
DRM_DEBUG_KMS("DP DSC computed with Input Bpp = %d "
"Compressed Bpp = %d Slice Count = %d\n",
pipe_config->pipe_bpp,
pipe_config->dsc_params.compressed_bpp,
pipe_config->dsc_params.slice_count);
pipe_config->dsc.compressed_bpp,
pipe_config->dsc.slice_count);
return 0;
}
@ -2211,15 +2220,15 @@ intel_dp_compute_link_config(struct intel_encoder *encoder,
return ret;
}
if (pipe_config->dsc_params.compression_enable) {
if (pipe_config->dsc.compression_enable) {
DRM_DEBUG_KMS("DP lane count %d clock %d Input bpp %d Compressed bpp %d\n",
pipe_config->lane_count, pipe_config->port_clock,
pipe_config->pipe_bpp,
pipe_config->dsc_params.compressed_bpp);
pipe_config->dsc.compressed_bpp);
DRM_DEBUG_KMS("DP link rate required %i available %i\n",
intel_dp_link_required(adjusted_mode->crtc_clock,
pipe_config->dsc_params.compressed_bpp),
pipe_config->dsc.compressed_bpp),
intel_dp_max_data_rate(pipe_config->port_clock,
pipe_config->lane_count));
} else {
@ -2377,8 +2386,8 @@ intel_dp_compute_config(struct intel_encoder *encoder,
pipe_config->limited_color_range =
intel_dp_limited_color_range(pipe_config, conn_state);
if (pipe_config->dsc_params.compression_enable)
output_bpp = pipe_config->dsc_params.compressed_bpp;
if (pipe_config->dsc.compression_enable)
output_bpp = pipe_config->dsc.compressed_bpp;
else
output_bpp = intel_dp_output_bpp(pipe_config, pipe_config->pipe_bpp);
@ -3102,7 +3111,7 @@ void intel_dp_sink_set_decompression_state(struct intel_dp *intel_dp,
{
int ret;
if (!crtc_state->dsc_params.compression_enable)
if (!crtc_state->dsc.compression_enable)
return;
ret = drm_dp_dpcd_writeb(&intel_dp->aux, DP_DSC_ENABLE,
@ -5688,6 +5697,12 @@ out:
if (status != connector_status_connected && !intel_dp->is_mst)
intel_dp_unset_edid(intel_dp);
/*
* Make sure the refs for power wells enabled during detect are
* dropped to avoid a new detect cycle triggered by HPD polling.
*/
intel_display_power_flush_work(dev_priv);
return status;
}
@ -7560,11 +7575,11 @@ bool intel_dp_init(struct drm_i915_private *dev_priv,
intel_encoder->power_domain = intel_port_to_power_domain(port);
if (IS_CHERRYVIEW(dev_priv)) {
if (port == PORT_D)
intel_encoder->crtc_mask = BIT(PIPE_C);
intel_encoder->pipe_mask = BIT(PIPE_C);
else
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B);
intel_encoder->pipe_mask = BIT(PIPE_A) | BIT(PIPE_B);
} else {
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C);
intel_encoder->pipe_mask = ~0;
}
intel_encoder->cloneable = 0;
intel_encoder->port = port;

View File

@ -600,8 +600,6 @@ intel_dp_create_fake_mst_encoder(struct intel_digital_port *intel_dig_port, enum
struct intel_dp_mst_encoder *intel_mst;
struct intel_encoder *intel_encoder;
struct drm_device *dev = intel_dig_port->base.base.dev;
struct drm_i915_private *dev_priv = to_i915(dev);
enum pipe pipe_iter;
intel_mst = kzalloc(sizeof(*intel_mst), GFP_KERNEL);
@ -619,8 +617,15 @@ intel_dp_create_fake_mst_encoder(struct intel_digital_port *intel_dig_port, enum
intel_encoder->power_domain = intel_dig_port->base.power_domain;
intel_encoder->port = intel_dig_port->base.port;
intel_encoder->cloneable = 0;
for_each_pipe(dev_priv, pipe_iter)
intel_encoder->crtc_mask |= BIT(pipe_iter);
/*
* This is wrong, but broken userspace uses the intersection
* of possible_crtcs of all the encoders of a given connector
* to figure out which crtcs can drive said connector. What
* should be used instead is the union of possible_crtcs.
* To keep such userspace functioning we must misconfigure
* this to make sure the intersection is not empty :(
*/
intel_encoder->pipe_mask = ~0;
intel_encoder->compute_config = intel_dp_mst_compute_config;
intel_encoder->disable = intel_mst_disable_dp;

View File

@ -526,16 +526,31 @@ static void hsw_ddi_wrpll_disable(struct drm_i915_private *dev_priv,
val = I915_READ(WRPLL_CTL(id));
I915_WRITE(WRPLL_CTL(id), val & ~WRPLL_PLL_ENABLE);
POSTING_READ(WRPLL_CTL(id));
/*
* Try to set up the PCH reference clock once all DPLLs
* that depend on it have been shut down.
*/
if (dev_priv->pch_ssc_use & BIT(id))
intel_init_pch_refclk(dev_priv);
}
static void hsw_ddi_spll_disable(struct drm_i915_private *dev_priv,
struct intel_shared_dpll *pll)
{
enum intel_dpll_id id = pll->info->id;
u32 val;
val = I915_READ(SPLL_CTL);
I915_WRITE(SPLL_CTL, val & ~SPLL_PLL_ENABLE);
POSTING_READ(SPLL_CTL);
/*
* Try to set up the PCH reference clock once all DPLLs
* that depend on it have been shut down.
*/
if (dev_priv->pch_ssc_use & BIT(id))
intel_init_pch_refclk(dev_priv);
}
static bool hsw_ddi_wrpll_get_hw_state(struct drm_i915_private *dev_priv,

View File

@ -147,11 +147,11 @@ enum intel_dpll_id {
*/
DPLL_ID_ICL_MGPLL4 = 6,
/**
* @DPLL_ID_TGL_TCPLL5: TGL TC PLL port 5 (TC5)
* @DPLL_ID_TGL_MGPLL5: TGL TC PLL port 5 (TC5)
*/
DPLL_ID_TGL_MGPLL5 = 7,
/**
* @DPLL_ID_TGL_TCPLL6: TGL TC PLL port 6 (TC6)
* @DPLL_ID_TGL_MGPLL6: TGL TC PLL port 6 (TC6)
*/
DPLL_ID_TGL_MGPLL6 = 8,
};
@ -337,6 +337,11 @@ struct intel_shared_dpll {
* @info: platform specific info
*/
const struct dpll_info *info;
/**
* @wakeref: In some platforms a device-level runtime pm reference may
* need to be grabbed to disable DC states while this DPLL is enabled
*/
intel_wakeref_t wakeref;
};

View File

@ -505,7 +505,7 @@ void intel_dvo_init(struct drm_i915_private *dev_priv)
intel_encoder->type = INTEL_OUTPUT_DVO;
intel_encoder->power_domain = POWER_DOMAIN_PORT_OTHER;
intel_encoder->port = port;
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B);
intel_encoder->pipe_mask = ~0;
switch (dvo->type) {
case INTEL_DVO_CHIP_TMDS:

View File

@ -922,7 +922,7 @@ static void intel_hdcp_prop_work(struct work_struct *work)
bool is_hdcp_supported(struct drm_i915_private *dev_priv, enum port port)
{
/* PORT E doesn't have HDCP, and PORT F is disabled */
return INTEL_GEN(dev_priv) >= 9 && port < PORT_E;
return INTEL_INFO(dev_priv)->display.has_hdcp && port < PORT_E;
}
static int

View File

@ -2626,6 +2626,12 @@ out:
if (status != connector_status_connected)
cec_notifier_phys_addr_invalidate(intel_hdmi->cec_notifier);
/*
* Make sure the refs for power wells enabled during detect are
* dropped to avoid a new detect cycle triggered by HPD polling.
*/
intel_display_power_flush_work(dev_priv);
return status;
}
@ -3277,11 +3283,11 @@ void intel_hdmi_init(struct drm_i915_private *dev_priv,
intel_encoder->port = port;
if (IS_CHERRYVIEW(dev_priv)) {
if (port == PORT_D)
intel_encoder->crtc_mask = BIT(PIPE_C);
intel_encoder->pipe_mask = BIT(PIPE_C);
else
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B);
intel_encoder->pipe_mask = BIT(PIPE_A) | BIT(PIPE_B);
} else {
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C);
intel_encoder->pipe_mask = ~0;
}
intel_encoder->cloneable = 1 << INTEL_OUTPUT_ANALOG;
/*

View File

@ -899,12 +899,10 @@ void intel_lvds_init(struct drm_i915_private *dev_priv)
intel_encoder->power_domain = POWER_DOMAIN_PORT_OTHER;
intel_encoder->port = PORT_NONE;
intel_encoder->cloneable = 0;
if (HAS_PCH_SPLIT(dev_priv))
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C);
else if (IS_GEN(dev_priv, 4))
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B);
if (INTEL_GEN(dev_priv) < 4)
intel_encoder->pipe_mask = BIT(PIPE_B);
else
intel_encoder->crtc_mask = BIT(PIPE_B);
intel_encoder->pipe_mask = ~0;
drm_connector_helper_add(connector, &intel_lvds_connector_helper_funcs);
connector->display_info.subpixel_order = SubPixelHorizontalRGB;

View File

@ -30,6 +30,7 @@
#include <drm/i915_drm.h>
#include "gem/i915_gem_pm.h"
#include "gt/intel_ring.h"
#include "i915_drv.h"
#include "i915_reg.h"

View File

@ -76,7 +76,7 @@ static bool intel_psr2_enabled(struct drm_i915_private *dev_priv,
const struct intel_crtc_state *crtc_state)
{
/* Cannot enable DSC and PSR2 simultaneously */
WARN_ON(crtc_state->dsc_params.compression_enable &&
WARN_ON(crtc_state->dsc.compression_enable &&
crtc_state->has_psr2);
switch (dev_priv->psr.debug & I915_PSR_DEBUG_MODE_MASK) {
@ -623,7 +623,7 @@ static bool intel_psr2_config_valid(struct intel_dp *intel_dp,
* resolution requires DSC to be enabled, priority is given to DSC
* over PSR2.
*/
if (crtc_state->dsc_params.compression_enable) {
if (crtc_state->dsc.compression_enable) {
DRM_DEBUG_KMS("PSR2 cannot be enabled since DSC is enabled\n");
return false;
}
@ -740,25 +740,6 @@ static void intel_psr_activate(struct intel_dp *intel_dp)
dev_priv->psr.active = true;
}
static i915_reg_t gen9_chicken_trans_reg(struct drm_i915_private *dev_priv,
enum transcoder cpu_transcoder)
{
static const i915_reg_t regs[] = {
[TRANSCODER_A] = CHICKEN_TRANS_A,
[TRANSCODER_B] = CHICKEN_TRANS_B,
[TRANSCODER_C] = CHICKEN_TRANS_C,
[TRANSCODER_EDP] = CHICKEN_TRANS_EDP,
};
WARN_ON(INTEL_GEN(dev_priv) < 9);
if (WARN_ON(cpu_transcoder >= ARRAY_SIZE(regs) ||
!regs[cpu_transcoder].reg))
cpu_transcoder = TRANSCODER_A;
return regs[cpu_transcoder];
}
static void intel_psr_enable_source(struct intel_dp *intel_dp,
const struct intel_crtc_state *crtc_state)
{
@ -774,8 +755,7 @@ static void intel_psr_enable_source(struct intel_dp *intel_dp,
if (dev_priv->psr.psr2_enabled && (IS_GEN(dev_priv, 9) &&
!IS_GEMINILAKE(dev_priv))) {
i915_reg_t reg = gen9_chicken_trans_reg(dev_priv,
cpu_transcoder);
i915_reg_t reg = CHICKEN_TRANS(cpu_transcoder);
u32 chicken = I915_READ(reg);
chicken |= PSR2_VSC_ENABLE_PROG_HEADER |
@ -1437,7 +1417,7 @@ void intel_psr_short_pulse(struct intel_dp *intel_dp)
if (val & DP_PSR_VSC_SDP_UNCORRECTABLE_ERROR)
DRM_DEBUG_KMS("PSR VSC SDP uncorrectable error, disabling PSR\n");
if (val & DP_PSR_LINK_CRC_ERROR)
DRM_ERROR("PSR Link CRC error, disabling PSR\n");
DRM_DEBUG_KMS("PSR Link CRC error, disabling PSR\n");
if (val & ~errors)
DRM_ERROR("PSR_ERROR_STATUS unhandled errors %x\n",

View File

@ -2921,7 +2921,7 @@ intel_sdvo_output_setup(struct intel_sdvo *intel_sdvo, u16 flags)
bytes[0], bytes[1]);
return false;
}
intel_sdvo->base.crtc_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C);
intel_sdvo->base.pipe_mask = ~0;
return true;
}

View File

@ -322,6 +322,55 @@ bool icl_is_hdr_plane(struct drm_i915_private *dev_priv, enum plane_id plane_id)
icl_hdr_plane_mask() & BIT(plane_id);
}
static void
skl_plane_ratio(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state,
unsigned int *num, unsigned int *den)
{
struct drm_i915_private *dev_priv = to_i915(plane_state->base.plane->dev);
const struct drm_framebuffer *fb = plane_state->base.fb;
if (fb->format->cpp[0] == 8) {
if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv)) {
*num = 10;
*den = 8;
} else {
*num = 9;
*den = 8;
}
} else {
*num = 1;
*den = 1;
}
}
static int skl_plane_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
struct drm_i915_private *dev_priv = to_i915(plane_state->base.plane->dev);
unsigned int pixel_rate = crtc_state->pixel_rate;
unsigned int src_w, src_h, dst_w, dst_h;
unsigned int num, den;
skl_plane_ratio(crtc_state, plane_state, &num, &den);
/* two pixels per clock on glk+ */
if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv))
den *= 2;
src_w = drm_rect_width(&plane_state->base.src) >> 16;
src_h = drm_rect_height(&plane_state->base.src) >> 16;
dst_w = drm_rect_width(&plane_state->base.dst);
dst_h = drm_rect_height(&plane_state->base.dst);
/* Downscaling limits the maximum pixel rate */
dst_w = min(src_w, dst_w);
dst_h = min(src_h, dst_h);
return DIV64_U64_ROUND_UP(mul_u32_u32(pixel_rate * num, src_w * src_h),
mul_u32_u32(den, dst_w * dst_h));
}
static unsigned int
skl_plane_max_stride(struct intel_plane *plane,
u32 pixel_format, u64 modifier,
@ -811,6 +860,85 @@ vlv_update_clrc(const struct intel_plane_state *plane_state)
SP_SH_SIN(sh_sin) | SP_SH_COS(sh_cos));
}
static void
vlv_plane_ratio(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state,
unsigned int *num, unsigned int *den)
{
u8 active_planes = crtc_state->active_planes & ~BIT(PLANE_CURSOR);
const struct drm_framebuffer *fb = plane_state->base.fb;
unsigned int cpp = fb->format->cpp[0];
/*
* VLV bspec only considers cases where all three planes are
* enabled, and cases where the primary and one sprite is enabled.
* Let's assume the case with just two sprites enabled also
* maps to the latter case.
*/
if (hweight8(active_planes) == 3) {
switch (cpp) {
case 8:
*num = 11;
*den = 8;
break;
case 4:
*num = 18;
*den = 16;
break;
default:
*num = 1;
*den = 1;
break;
}
} else if (hweight8(active_planes) == 2) {
switch (cpp) {
case 8:
*num = 10;
*den = 8;
break;
case 4:
*num = 17;
*den = 16;
break;
default:
*num = 1;
*den = 1;
break;
}
} else {
switch (cpp) {
case 8:
*num = 10;
*den = 8;
break;
default:
*num = 1;
*den = 1;
break;
}
}
}
int vlv_plane_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
unsigned int pixel_rate;
unsigned int num, den;
/*
* Note that crtc_state->pixel_rate accounts for both
* horizontal and vertical panel fitter downscaling factors.
* Pre-HSW bspec tells us to only consider the horizontal
* downscaling factor here. We ignore that and just consider
* both for simplicity.
*/
pixel_rate = crtc_state->pixel_rate;
vlv_plane_ratio(crtc_state, plane_state, &num, &den);
return DIV_ROUND_UP(pixel_rate * num, den);
}
static u32 vlv_sprite_ctl_crtc(const struct intel_crtc_state *crtc_state)
{
u32 sprctl = 0;
@ -1017,6 +1145,164 @@ vlv_plane_get_hw_state(struct intel_plane *plane,
return ret;
}
static void ivb_plane_ratio(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state,
unsigned int *num, unsigned int *den)
{
u8 active_planes = crtc_state->active_planes & ~BIT(PLANE_CURSOR);
const struct drm_framebuffer *fb = plane_state->base.fb;
unsigned int cpp = fb->format->cpp[0];
if (hweight8(active_planes) == 2) {
switch (cpp) {
case 8:
*num = 10;
*den = 8;
break;
case 4:
*num = 17;
*den = 16;
break;
default:
*num = 1;
*den = 1;
break;
}
} else {
switch (cpp) {
case 8:
*num = 9;
*den = 8;
break;
default:
*num = 1;
*den = 1;
break;
}
}
}
static void ivb_plane_ratio_scaling(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state,
unsigned int *num, unsigned int *den)
{
const struct drm_framebuffer *fb = plane_state->base.fb;
unsigned int cpp = fb->format->cpp[0];
switch (cpp) {
case 8:
*num = 12;
*den = 8;
break;
case 4:
*num = 19;
*den = 16;
break;
case 2:
*num = 33;
*den = 32;
break;
default:
*num = 1;
*den = 1;
break;
}
}
int ivb_plane_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
unsigned int pixel_rate;
unsigned int num, den;
/*
* Note that crtc_state->pixel_rate accounts for both
* horizontal and vertical panel fitter downscaling factors.
* Pre-HSW bspec tells us to only consider the horizontal
* downscaling factor here. We ignore that and just consider
* both for simplicity.
*/
pixel_rate = crtc_state->pixel_rate;
ivb_plane_ratio(crtc_state, plane_state, &num, &den);
return DIV_ROUND_UP(pixel_rate * num, den);
}
static int ivb_sprite_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
unsigned int src_w, dst_w, pixel_rate;
unsigned int num, den;
/*
* Note that crtc_state->pixel_rate accounts for both
* horizontal and vertical panel fitter downscaling factors.
* Pre-HSW bspec tells us to only consider the horizontal
* downscaling factor here. We ignore that and just consider
* both for simplicity.
*/
pixel_rate = crtc_state->pixel_rate;
src_w = drm_rect_width(&plane_state->base.src) >> 16;
dst_w = drm_rect_width(&plane_state->base.dst);
if (src_w != dst_w)
ivb_plane_ratio_scaling(crtc_state, plane_state, &num, &den);
else
ivb_plane_ratio(crtc_state, plane_state, &num, &den);
/* Horizontal downscaling limits the maximum pixel rate */
dst_w = min(src_w, dst_w);
return DIV_ROUND_UP_ULL(mul_u32_u32(pixel_rate, num * src_w),
den * dst_w);
}
static void hsw_plane_ratio(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state,
unsigned int *num, unsigned int *den)
{
u8 active_planes = crtc_state->active_planes & ~BIT(PLANE_CURSOR);
const struct drm_framebuffer *fb = plane_state->base.fb;
unsigned int cpp = fb->format->cpp[0];
if (hweight8(active_planes) == 2) {
switch (cpp) {
case 8:
*num = 10;
*den = 8;
break;
default:
*num = 1;
*den = 1;
break;
}
} else {
switch (cpp) {
case 8:
*num = 9;
*den = 8;
break;
default:
*num = 1;
*den = 1;
break;
}
}
}
int hsw_plane_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
unsigned int pixel_rate = crtc_state->pixel_rate;
unsigned int num, den;
hsw_plane_ratio(crtc_state, plane_state, &num, &den);
return DIV_ROUND_UP(pixel_rate * num, den);
}
static u32 ivb_sprite_ctl_crtc(const struct intel_crtc_state *crtc_state)
{
u32 sprctl = 0;
@ -1030,6 +1316,16 @@ static u32 ivb_sprite_ctl_crtc(const struct intel_crtc_state *crtc_state)
return sprctl;
}
static bool ivb_need_sprite_gamma(const struct intel_plane_state *plane_state)
{
struct drm_i915_private *dev_priv =
to_i915(plane_state->base.plane->dev);
const struct drm_framebuffer *fb = plane_state->base.fb;
return fb->format->cpp[0] == 8 &&
(IS_IVYBRIDGE(dev_priv) || IS_HASWELL(dev_priv));
}
static u32 ivb_sprite_ctl(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
@ -1052,6 +1348,12 @@ static u32 ivb_sprite_ctl(const struct intel_crtc_state *crtc_state,
case DRM_FORMAT_XRGB8888:
sprctl |= SPRITE_FORMAT_RGBX888;
break;
case DRM_FORMAT_XBGR16161616F:
sprctl |= SPRITE_FORMAT_RGBX161616 | SPRITE_RGB_ORDER_RGBX;
break;
case DRM_FORMAT_XRGB16161616F:
sprctl |= SPRITE_FORMAT_RGBX161616;
break;
case DRM_FORMAT_YUYV:
sprctl |= SPRITE_FORMAT_YUV422 | SPRITE_YUV_ORDER_YUYV;
break;
@ -1069,7 +1371,8 @@ static u32 ivb_sprite_ctl(const struct intel_crtc_state *crtc_state,
return 0;
}
sprctl |= SPRITE_INT_GAMMA_DISABLE;
if (!ivb_need_sprite_gamma(plane_state))
sprctl |= SPRITE_INT_GAMMA_DISABLE;
if (plane_state->base.color_encoding == DRM_COLOR_YCBCR_BT709)
sprctl |= SPRITE_YUV_TO_RGB_CSC_FORMAT_BT709;
@ -1091,12 +1394,26 @@ static u32 ivb_sprite_ctl(const struct intel_crtc_state *crtc_state,
return sprctl;
}
static void ivb_sprite_linear_gamma(u16 gamma[18])
static void ivb_sprite_linear_gamma(const struct intel_plane_state *plane_state,
u16 gamma[18])
{
int i;
int scale, i;
for (i = 0; i < 17; i++)
gamma[i] = (i << 10) / 16;
/*
* WaFP16GammaEnabling:ivb,hsw
* "Workaround : When using the 64-bit format, the sprite output
* on each color channel has one quarter amplitude. It can be
* brought up to full amplitude by using sprite internal gamma
* correction, pipe gamma correction, or pipe color space
* conversion to multiply the sprite output by four."
*/
scale = 4;
for (i = 0; i < 16; i++)
gamma[i] = min((scale * i << 10) / 16, (1 << 10) - 1);
gamma[i] = min((scale * i << 10) / 16, 1 << 10);
i++;
gamma[i] = 3 << 10;
i++;
@ -1110,7 +1427,10 @@ static void ivb_update_gamma(const struct intel_plane_state *plane_state)
u16 gamma[18];
int i;
ivb_sprite_linear_gamma(gamma);
if (!ivb_need_sprite_gamma(plane_state))
return;
ivb_sprite_linear_gamma(plane_state, gamma);
/* FIXME these register are single buffered :( */
for (i = 0; i < 16; i++)
@ -1243,6 +1563,53 @@ ivb_plane_get_hw_state(struct intel_plane *plane,
return ret;
}
static int g4x_sprite_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state)
{
const struct drm_framebuffer *fb = plane_state->base.fb;
unsigned int hscale, pixel_rate;
unsigned int limit, decimate;
/*
* Note that crtc_state->pixel_rate accounts for both
* horizontal and vertical panel fitter downscaling factors.
* Pre-HSW bspec tells us to only consider the horizontal
* downscaling factor here. We ignore that and just consider
* both for simplicity.
*/
pixel_rate = crtc_state->pixel_rate;
/* Horizontal downscaling limits the maximum pixel rate */
hscale = drm_rect_calc_hscale(&plane_state->base.src,
&plane_state->base.dst,
0, INT_MAX);
if (hscale < 0x10000)
return pixel_rate;
/* Decimation steps at 2x,4x,8x,16x */
decimate = ilog2(hscale >> 16);
hscale >>= decimate;
/* Starting limit is 90% of cdclk */
limit = 9;
/* -10% per decimation step */
limit -= decimate;
/* -10% for RGB */
if (fb->format->cpp[0] >= 4)
limit--; /* -10% for RGB */
/*
* We should also do -10% if sprite scaling is enabled
* on the other pipe, but we can't really check for that,
* so we ignore it.
*/
return DIV_ROUND_UP_ULL(mul_u32_u32(pixel_rate, 10 * hscale),
limit << 16);
}
static unsigned int
g4x_sprite_max_stride(struct intel_plane *plane,
u32 pixel_format, u64 modifier,
@ -1286,6 +1653,12 @@ static u32 g4x_sprite_ctl(const struct intel_crtc_state *crtc_state,
case DRM_FORMAT_XRGB8888:
dvscntr |= DVS_FORMAT_RGBX888;
break;
case DRM_FORMAT_XBGR16161616F:
dvscntr |= DVS_FORMAT_RGBX161616 | DVS_RGB_ORDER_XBGR;
break;
case DRM_FORMAT_XRGB16161616F:
dvscntr |= DVS_FORMAT_RGBX161616;
break;
case DRM_FORMAT_YUYV:
dvscntr |= DVS_FORMAT_YUV422 | DVS_YUV_ORDER_YUYV;
break;
@ -1499,6 +1872,11 @@ static bool intel_fb_scalable(const struct drm_framebuffer *fb)
switch (fb->format->format) {
case DRM_FORMAT_C8:
return false;
case DRM_FORMAT_XRGB16161616F:
case DRM_FORMAT_ARGB16161616F:
case DRM_FORMAT_XBGR16161616F:
case DRM_FORMAT_ABGR16161616F:
return INTEL_GEN(to_i915(fb->dev)) >= 11;
default:
return true;
}
@ -1787,6 +2165,22 @@ static int skl_plane_check_nv12_rotation(const struct intel_plane_state *plane_s
return 0;
}
static int skl_plane_max_scale(struct drm_i915_private *dev_priv,
const struct drm_framebuffer *fb)
{
/*
* We don't yet know the final source width nor
* whether we can use the HQ scaler mode. Assume
* the best case.
* FIXME need to properly check this later.
*/
if (INTEL_GEN(dev_priv) >= 10 || IS_GEMINILAKE(dev_priv) ||
!drm_format_info_is_yuv_semiplanar(fb->format))
return 0x30000 - 1;
else
return 0x20000 - 1;
}
static int skl_plane_check(struct intel_crtc_state *crtc_state,
struct intel_plane_state *plane_state)
{
@ -1804,7 +2198,7 @@ static int skl_plane_check(struct intel_crtc_state *crtc_state,
/* use scaler when colorkey is not required */
if (!plane_state->ckey.flags && intel_fb_scalable(fb)) {
min_scale = 1;
max_scale = skl_max_scale(crtc_state, fb->format);
max_scale = skl_plane_max_scale(dev_priv, fb);
}
ret = drm_atomic_helper_check_plane_state(&plane_state->base,
@ -1979,8 +2373,10 @@ static const u64 i9xx_plane_format_modifiers[] = {
};
static const u32 snb_plane_formats[] = {
DRM_FORMAT_XBGR8888,
DRM_FORMAT_XRGB8888,
DRM_FORMAT_XBGR8888,
DRM_FORMAT_XRGB16161616F,
DRM_FORMAT_XBGR16161616F,
DRM_FORMAT_YUYV,
DRM_FORMAT_YVYU,
DRM_FORMAT_UYVY,
@ -2010,6 +2406,8 @@ static const u32 skl_plane_formats[] = {
DRM_FORMAT_ABGR8888,
DRM_FORMAT_XRGB2101010,
DRM_FORMAT_XBGR2101010,
DRM_FORMAT_XRGB16161616F,
DRM_FORMAT_XBGR16161616F,
DRM_FORMAT_YUYV,
DRM_FORMAT_YVYU,
DRM_FORMAT_UYVY,
@ -2025,6 +2423,8 @@ static const u32 skl_planar_formats[] = {
DRM_FORMAT_ABGR8888,
DRM_FORMAT_XRGB2101010,
DRM_FORMAT_XBGR2101010,
DRM_FORMAT_XRGB16161616F,
DRM_FORMAT_XBGR16161616F,
DRM_FORMAT_YUYV,
DRM_FORMAT_YVYU,
DRM_FORMAT_UYVY,
@ -2041,6 +2441,8 @@ static const u32 glk_planar_formats[] = {
DRM_FORMAT_ABGR8888,
DRM_FORMAT_XRGB2101010,
DRM_FORMAT_XBGR2101010,
DRM_FORMAT_XRGB16161616F,
DRM_FORMAT_XBGR16161616F,
DRM_FORMAT_YUYV,
DRM_FORMAT_YVYU,
DRM_FORMAT_UYVY,
@ -2191,6 +2593,8 @@ static bool snb_sprite_format_mod_supported(struct drm_plane *_plane,
switch (format) {
case DRM_FORMAT_XRGB8888:
case DRM_FORMAT_XBGR8888:
case DRM_FORMAT_XRGB16161616F:
case DRM_FORMAT_XBGR16161616F:
case DRM_FORMAT_YUYV:
case DRM_FORMAT_YVYU:
case DRM_FORMAT_UYVY:
@ -2511,6 +2915,7 @@ skl_universal_plane_create(struct drm_i915_private *dev_priv,
plane->disable_plane = skl_disable_plane;
plane->get_hw_state = skl_plane_get_hw_state;
plane->check_plane = skl_plane_check;
plane->min_cdclk = skl_plane_min_cdclk;
if (icl_is_nv12_y_plane(plane_id))
plane->update_slave = icl_update_slave;
@ -2618,6 +3023,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
plane->disable_plane = vlv_disable_plane;
plane->get_hw_state = vlv_plane_get_hw_state;
plane->check_plane = vlv_sprite_check;
plane->min_cdclk = vlv_plane_min_cdclk;
formats = vlv_plane_formats;
num_formats = ARRAY_SIZE(vlv_plane_formats);
@ -2631,6 +3037,11 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
plane->get_hw_state = ivb_plane_get_hw_state;
plane->check_plane = g4x_sprite_check;
if (IS_BROADWELL(dev_priv) || IS_HASWELL(dev_priv))
plane->min_cdclk = hsw_plane_min_cdclk;
else
plane->min_cdclk = ivb_sprite_min_cdclk;
formats = snb_plane_formats;
num_formats = ARRAY_SIZE(snb_plane_formats);
modifiers = i9xx_plane_format_modifiers;
@ -2642,6 +3053,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
plane->disable_plane = g4x_disable_plane;
plane->get_hw_state = g4x_plane_get_hw_state;
plane->check_plane = g4x_sprite_check;
plane->min_cdclk = g4x_sprite_min_cdclk;
modifiers = i9xx_plane_format_modifiers;
if (IS_GEN(dev_priv, 6)) {

View File

@ -49,4 +49,11 @@ static inline u8 icl_hdr_plane_mask(void)
bool icl_is_hdr_plane(struct drm_i915_private *dev_priv, enum plane_id plane_id);
int ivb_plane_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state);
int hsw_plane_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state);
int vlv_plane_min_cdclk(const struct intel_crtc_state *crtc_state,
const struct intel_plane_state *plane_state);
#endif /* __INTEL_SPRITE_H__ */

View File

@ -1701,7 +1701,7 @@ intel_tv_detect(struct drm_connector *connector,
struct intel_load_detect_pipe tmp;
int ret;
ret = intel_get_load_detect_pipe(connector, NULL, &tmp, ctx);
ret = intel_get_load_detect_pipe(connector, &tmp, ctx);
if (ret < 0)
return ret;
@ -1947,7 +1947,7 @@ intel_tv_init(struct drm_i915_private *dev_priv)
intel_encoder->type = INTEL_OUTPUT_TVOUT;
intel_encoder->power_domain = POWER_DOMAIN_PORT_OTHER;
intel_encoder->port = PORT_NONE;
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B);
intel_encoder->pipe_mask = ~0;
intel_encoder->cloneable = 0;
intel_tv->type = DRM_MODE_CONNECTOR_Unknown;

View File

@ -114,6 +114,7 @@ enum bdb_block_id {
BDB_LVDS_POWER = 44,
BDB_MIPI_CONFIG = 52,
BDB_MIPI_SEQUENCE = 53,
BDB_COMPRESSION_PARAMETERS = 56,
BDB_SKIP = 254, /* VBIOS private block, ignore */
};
@ -811,4 +812,55 @@ struct bdb_mipi_sequence {
u8 data[0]; /* up to 6 variable length blocks */
} __packed;
/*
* Block 56 - Compression Parameters
*/
#define VBT_RC_BUFFER_BLOCK_SIZE_1KB 0
#define VBT_RC_BUFFER_BLOCK_SIZE_4KB 1
#define VBT_RC_BUFFER_BLOCK_SIZE_16KB 2
#define VBT_RC_BUFFER_BLOCK_SIZE_64KB 3
#define VBT_DSC_LINE_BUFFER_DEPTH(vbt_value) ((vbt_value) + 8) /* bits */
#define VBT_DSC_MAX_BPP(vbt_value) (6 + (vbt_value) * 2)
struct dsc_compression_parameters_entry {
u8 version_major:4;
u8 version_minor:4;
u8 rc_buffer_block_size:2;
u8 reserved1:6;
/*
* Buffer size in bytes:
*
* 4 ^ rc_buffer_block_size * 1024 * (rc_buffer_size + 1) bytes
*/
u8 rc_buffer_size;
u32 slices_per_line;
u8 line_buffer_depth:4;
u8 reserved2:4;
/* Flag Bits 1 */
u8 block_prediction_enable:1;
u8 reserved3:7;
u8 max_bpp; /* mapping */
/* Color depth capabilities */
u8 reserved4:1;
u8 support_8bpc:1;
u8 support_10bpc:1;
u8 support_12bpc:1;
u8 reserved5:4;
u16 slice_height;
} __packed;
struct bdb_compression_parameters {
u16 entry_size;
struct dsc_compression_parameters_entry data[16];
} __packed;
#endif /* _INTEL_VBT_DEFS_H_ */

View File

@ -322,8 +322,8 @@ static int get_column_index_for_rc_params(u8 bits_per_component)
int intel_dp_compute_dsc_params(struct intel_dp *intel_dp,
struct intel_crtc_state *pipe_config)
{
struct drm_dsc_config *vdsc_cfg = &pipe_config->dp_dsc_cfg;
u16 compressed_bpp = pipe_config->dsc_params.compressed_bpp;
struct drm_dsc_config *vdsc_cfg = &pipe_config->dsc.config;
u16 compressed_bpp = pipe_config->dsc.compressed_bpp;
u8 i = 0;
int row_index = 0;
int column_index = 0;
@ -332,7 +332,7 @@ int intel_dp_compute_dsc_params(struct intel_dp *intel_dp,
vdsc_cfg->pic_width = pipe_config->base.adjusted_mode.crtc_hdisplay;
vdsc_cfg->pic_height = pipe_config->base.adjusted_mode.crtc_vdisplay;
vdsc_cfg->slice_width = DIV_ROUND_UP(vdsc_cfg->pic_width,
pipe_config->dsc_params.slice_count);
pipe_config->dsc.slice_count);
/*
* Slice Height of 8 works for all currently available panels. So start
* with that if pic_height is an integral multiple of 8.
@ -485,13 +485,13 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
{
struct intel_crtc *crtc = to_intel_crtc(crtc_state->base.crtc);
struct drm_i915_private *dev_priv = to_i915(encoder->base.dev);
const struct drm_dsc_config *vdsc_cfg = &crtc_state->dp_dsc_cfg;
const struct drm_dsc_config *vdsc_cfg = &crtc_state->dsc.config;
enum pipe pipe = crtc->pipe;
enum transcoder cpu_transcoder = crtc_state->cpu_transcoder;
u32 pps_val = 0;
u32 rc_buf_thresh_dword[4];
u32 rc_range_params_dword[8];
u8 num_vdsc_instances = (crtc_state->dsc_params.dsc_split) ? 2 : 1;
u8 num_vdsc_instances = (crtc_state->dsc.dsc_split) ? 2 : 1;
int i = 0;
/* Populate PICTURE_PARAMETER_SET_0 registers */
@ -514,11 +514,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_0, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_0(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_0(pipe),
pps_val);
}
@ -533,11 +533,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_1, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_1(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_1(pipe),
pps_val);
}
@ -553,11 +553,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_2, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_2(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_2(pipe),
pps_val);
}
@ -573,11 +573,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_3, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_3(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_3(pipe),
pps_val);
}
@ -593,11 +593,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_4, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_4(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_4(pipe),
pps_val);
}
@ -613,11 +613,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_5, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_5(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_5(pipe),
pps_val);
}
@ -635,11 +635,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_6, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_6(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_6(pipe),
pps_val);
}
@ -655,11 +655,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_7, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_7(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_7(pipe),
pps_val);
}
@ -675,11 +675,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_8, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_8(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_8(pipe),
pps_val);
}
@ -695,11 +695,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_9, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_9(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_9(pipe),
pps_val);
}
@ -717,11 +717,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_10, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_10(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_10(pipe),
pps_val);
}
@ -740,11 +740,11 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
* If 2 VDSC instances are needed, configure PPS for second
* VDSC
*/
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(DSCC_PICTURE_PARAMETER_SET_16, pps_val);
} else {
I915_WRITE(ICL_DSC0_PICTURE_PARAMETER_SET_16(pipe), pps_val);
if (crtc_state->dsc_params.dsc_split)
if (crtc_state->dsc.dsc_split)
I915_WRITE(ICL_DSC1_PICTURE_PARAMETER_SET_16(pipe),
pps_val);
}
@ -763,7 +763,7 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
I915_WRITE(DSCA_RC_BUF_THRESH_0_UDW, rc_buf_thresh_dword[1]);
I915_WRITE(DSCA_RC_BUF_THRESH_1, rc_buf_thresh_dword[2]);
I915_WRITE(DSCA_RC_BUF_THRESH_1_UDW, rc_buf_thresh_dword[3]);
if (crtc_state->dsc_params.dsc_split) {
if (crtc_state->dsc.dsc_split) {
I915_WRITE(DSCC_RC_BUF_THRESH_0,
rc_buf_thresh_dword[0]);
I915_WRITE(DSCC_RC_BUF_THRESH_0_UDW,
@ -782,7 +782,7 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
rc_buf_thresh_dword[2]);
I915_WRITE(ICL_DSC0_RC_BUF_THRESH_1_UDW(pipe),
rc_buf_thresh_dword[3]);
if (crtc_state->dsc_params.dsc_split) {
if (crtc_state->dsc.dsc_split) {
I915_WRITE(ICL_DSC1_RC_BUF_THRESH_0(pipe),
rc_buf_thresh_dword[0]);
I915_WRITE(ICL_DSC1_RC_BUF_THRESH_0_UDW(pipe),
@ -824,7 +824,7 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
rc_range_params_dword[6]);
I915_WRITE(DSCA_RC_RANGE_PARAMETERS_3_UDW,
rc_range_params_dword[7]);
if (crtc_state->dsc_params.dsc_split) {
if (crtc_state->dsc.dsc_split) {
I915_WRITE(DSCC_RC_RANGE_PARAMETERS_0,
rc_range_params_dword[0]);
I915_WRITE(DSCC_RC_RANGE_PARAMETERS_0_UDW,
@ -859,7 +859,7 @@ static void intel_configure_pps_for_dsc_encoder(struct intel_encoder *encoder,
rc_range_params_dword[6]);
I915_WRITE(ICL_DSC0_RC_RANGE_PARAMETERS_3_UDW(pipe),
rc_range_params_dword[7]);
if (crtc_state->dsc_params.dsc_split) {
if (crtc_state->dsc.dsc_split) {
I915_WRITE(ICL_DSC1_RC_RANGE_PARAMETERS_0(pipe),
rc_range_params_dword[0]);
I915_WRITE(ICL_DSC1_RC_RANGE_PARAMETERS_0_UDW(pipe),
@ -885,7 +885,7 @@ static void intel_dp_write_dsc_pps_sdp(struct intel_encoder *encoder,
{
struct intel_dp *intel_dp = enc_to_intel_dp(&encoder->base);
struct intel_digital_port *intel_dig_port = dp_to_dig_port(intel_dp);
const struct drm_dsc_config *vdsc_cfg = &crtc_state->dp_dsc_cfg;
const struct drm_dsc_config *vdsc_cfg = &crtc_state->dsc.config;
struct drm_dsc_pps_infoframe dp_dsc_pps_sdp;
/* Prepare DP SDP PPS header as per DP 1.4 spec, Table 2-123 */
@ -909,7 +909,7 @@ void intel_dsc_enable(struct intel_encoder *encoder,
u32 dss_ctl1_val = 0;
u32 dss_ctl2_val = 0;
if (!crtc_state->dsc_params.compression_enable)
if (!crtc_state->dsc.compression_enable)
return;
/* Enable Power wells for VDSC/joining */
@ -928,7 +928,7 @@ void intel_dsc_enable(struct intel_encoder *encoder,
dss_ctl2_reg = ICL_PIPE_DSS_CTL2(pipe);
}
dss_ctl2_val |= LEFT_BRANCH_VDSC_ENABLE;
if (crtc_state->dsc_params.dsc_split) {
if (crtc_state->dsc.dsc_split) {
dss_ctl2_val |= RIGHT_BRANCH_VDSC_ENABLE;
dss_ctl1_val |= JOINER_ENABLE;
}
@ -944,7 +944,7 @@ void intel_dsc_disable(const struct intel_crtc_state *old_crtc_state)
i915_reg_t dss_ctl1_reg, dss_ctl2_reg;
u32 dss_ctl1_val = 0, dss_ctl2_val = 0;
if (!old_crtc_state->dsc_params.compression_enable)
if (!old_crtc_state->dsc.compression_enable)
return;
if (old_crtc_state->cpu_transcoder == TRANSCODER_EDP) {

View File

@ -1870,11 +1870,11 @@ void vlv_dsi_init(struct drm_i915_private *dev_priv)
* port C. BXT isn't limited like this.
*/
if (IS_GEN9_LP(dev_priv))
intel_encoder->crtc_mask = BIT(PIPE_A) | BIT(PIPE_B) | BIT(PIPE_C);
intel_encoder->pipe_mask = ~0;
else if (port == PORT_A)
intel_encoder->crtc_mask = BIT(PIPE_A);
intel_encoder->pipe_mask = BIT(PIPE_A);
else
intel_encoder->crtc_mask = BIT(PIPE_B);
intel_encoder->pipe_mask = BIT(PIPE_B);
if (dev_priv->vbt.dsi.config->dual_link)
intel_dsi->ports = BIT(PORT_A) | BIT(PORT_C);

View File

@ -69,8 +69,10 @@
#include <drm/i915_drm.h>
#include "gt/intel_lrc_reg.h"
#include "gt/intel_engine_heartbeat.h"
#include "gt/intel_engine_user.h"
#include "gt/intel_lrc_reg.h"
#include "gt/intel_ring.h"
#include "i915_gem_context.h"
#include "i915_globals.h"
@ -276,6 +278,153 @@ void i915_gem_context_release(struct kref *ref)
schedule_work(&gc->free_work);
}
static inline struct i915_gem_engines *
__context_engines_static(const struct i915_gem_context *ctx)
{
return rcu_dereference_protected(ctx->engines, true);
}
static bool __reset_engine(struct intel_engine_cs *engine)
{
struct intel_gt *gt = engine->gt;
bool success = false;
if (!intel_has_reset_engine(gt))
return false;
if (!test_and_set_bit(I915_RESET_ENGINE + engine->id,
&gt->reset.flags)) {
success = intel_engine_reset(engine, NULL) == 0;
clear_and_wake_up_bit(I915_RESET_ENGINE + engine->id,
&gt->reset.flags);
}
return success;
}
static void __reset_context(struct i915_gem_context *ctx,
struct intel_engine_cs *engine)
{
intel_gt_handle_error(engine->gt, engine->mask, 0,
"context closure in %s", ctx->name);
}
static bool __cancel_engine(struct intel_engine_cs *engine)
{
/*
* Send a "high priority pulse" down the engine to cause the
* current request to be momentarily preempted. (If it fails to
* be preempted, it will be reset). As we have marked our context
* as banned, any incomplete request, including any running, will
* be skipped following the preemption.
*
* If there is no hangchecking (one of the reasons why we try to
* cancel the context) and no forced preemption, there may be no
* means by which we reset the GPU and evict the persistent hog.
* Ergo if we are unable to inject a preemptive pulse that can
* kill the banned context, we fallback to doing a local reset
* instead.
*/
if (IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT) &&
!intel_engine_pulse(engine))
return true;
/* If we are unable to send a pulse, try resetting this engine. */
return __reset_engine(engine);
}
static struct intel_engine_cs *__active_engine(struct i915_request *rq)
{
struct intel_engine_cs *engine, *locked;
/*
* Serialise with __i915_request_submit() so that it sees
* is-banned?, or we know the request is already inflight.
*/
locked = READ_ONCE(rq->engine);
spin_lock_irq(&locked->active.lock);
while (unlikely(locked != (engine = READ_ONCE(rq->engine)))) {
spin_unlock(&locked->active.lock);
spin_lock(&engine->active.lock);
locked = engine;
}
engine = NULL;
if (i915_request_is_active(rq) && !rq->fence.error)
engine = rq->engine;
spin_unlock_irq(&locked->active.lock);
return engine;
}
static struct intel_engine_cs *active_engine(struct intel_context *ce)
{
struct intel_engine_cs *engine = NULL;
struct i915_request *rq;
if (!ce->timeline)
return NULL;
rcu_read_lock();
list_for_each_entry_reverse(rq, &ce->timeline->requests, link) {
if (i915_request_completed(rq))
break;
/* Check with the backend if the request is inflight */
engine = __active_engine(rq);
if (engine)
break;
}
rcu_read_unlock();
return engine;
}
static void kill_context(struct i915_gem_context *ctx)
{
struct i915_gem_engines_iter it;
struct intel_context *ce;
/*
* If we are already banned, it was due to a guilty request causing
* a reset and the entire context being evicted from the GPU.
*/
if (i915_gem_context_is_banned(ctx))
return;
i915_gem_context_set_banned(ctx);
/*
* Map the user's engine back to the actual engines; one virtual
* engine will be mapped to multiple engines, and using ctx->engine[]
* the same engine may be have multiple instances in the user's map.
* However, we only care about pending requests, so only include
* engines on which there are incomplete requests.
*/
for_each_gem_engine(ce, __context_engines_static(ctx), it) {
struct intel_engine_cs *engine;
/*
* Check the current active state of this context; if we
* are currently executing on the GPU we need to evict
* ourselves. On the other hand, if we haven't yet been
* submitted to the GPU or if everything is complete,
* we have nothing to do.
*/
engine = active_engine(ce);
/* First attempt to gracefully cancel the context */
if (engine && !__cancel_engine(engine))
/*
* If we are unable to send a preemptive pulse to bump
* the context from the GPU, we have to resort to a full
* reset. We hope the collateral damage is worth it.
*/
__reset_context(ctx, engine);
}
}
static void context_close(struct i915_gem_context *ctx)
{
struct i915_address_space *vm;
@ -298,9 +447,47 @@ static void context_close(struct i915_gem_context *ctx)
lut_close(ctx);
mutex_unlock(&ctx->mutex);
/*
* If the user has disabled hangchecking, we can not be sure that
* the batches will ever complete after the context is closed,
* keeping the context and all resources pinned forever. So in this
* case we opt to forcibly kill off all remaining requests on
* context close.
*/
if (!i915_gem_context_is_persistent(ctx) ||
!i915_modparams.enable_hangcheck)
kill_context(ctx);
i915_gem_context_put(ctx);
}
static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
{
if (i915_gem_context_is_persistent(ctx) == state)
return 0;
if (state) {
/*
* Only contexts that are short-lived [that will expire or be
* reset] are allowed to survive past termination. We require
* hangcheck to ensure that the persistent requests are healthy.
*/
if (!i915_modparams.enable_hangcheck)
return -EINVAL;
i915_gem_context_set_persistence(ctx);
} else {
/* To cancel a context we use "preempt-to-idle" */
if (!(ctx->i915->caps.scheduler & I915_SCHEDULER_CAP_PREEMPTION))
return -ENODEV;
i915_gem_context_clear_persistence(ctx);
}
return 0;
}
static struct i915_gem_context *
__create_context(struct drm_i915_private *i915)
{
@ -335,6 +522,7 @@ __create_context(struct drm_i915_private *i915)
i915_gem_context_set_bannable(ctx);
i915_gem_context_set_recoverable(ctx);
__context_set_persistence(ctx, true /* cgroup hook? */);
for (i = 0; i < ARRAY_SIZE(ctx->hang_timestamp); i++)
ctx->hang_timestamp[i] = jiffies - CONTEXT_FAST_HANG_JIFFIES;
@ -491,6 +679,7 @@ i915_gem_context_create_kernel(struct drm_i915_private *i915, int prio)
return ctx;
i915_gem_context_clear_bannable(ctx);
i915_gem_context_set_persistence(ctx);
ctx->sched.priority = I915_USER_PRIORITY(prio);
GEM_BUG_ON(!i915_gem_context_is_kernel(ctx));
@ -1601,6 +1790,16 @@ err_free:
return err;
}
static int
set_persistence(struct i915_gem_context *ctx,
const struct drm_i915_gem_context_param *args)
{
if (args->size)
return -EINVAL;
return __context_set_persistence(ctx, args->value);
}
static int ctx_setparam(struct drm_i915_file_private *fpriv,
struct i915_gem_context *ctx,
struct drm_i915_gem_context_param *args)
@ -1678,6 +1877,10 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
ret = set_engines(ctx, args);
break;
case I915_CONTEXT_PARAM_PERSISTENCE:
ret = set_persistence(ctx, args);
break;
case I915_CONTEXT_PARAM_BAN_PERIOD:
default:
ret = -EINVAL;
@ -2130,6 +2333,11 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
ret = get_engines(ctx, args);
break;
case I915_CONTEXT_PARAM_PERSISTENCE:
args->size = 0;
args->value = i915_gem_context_is_persistent(ctx);
break;
case I915_CONTEXT_PARAM_BAN_PERIOD:
default:
ret = -EINVAL;

View File

@ -76,6 +76,21 @@ static inline void i915_gem_context_clear_recoverable(struct i915_gem_context *c
clear_bit(UCONTEXT_RECOVERABLE, &ctx->user_flags);
}
static inline bool i915_gem_context_is_persistent(const struct i915_gem_context *ctx)
{
return test_bit(UCONTEXT_PERSISTENCE, &ctx->user_flags);
}
static inline void i915_gem_context_set_persistence(struct i915_gem_context *ctx)
{
set_bit(UCONTEXT_PERSISTENCE, &ctx->user_flags);
}
static inline void i915_gem_context_clear_persistence(struct i915_gem_context *ctx)
{
clear_bit(UCONTEXT_PERSISTENCE, &ctx->user_flags);
}
static inline bool i915_gem_context_is_banned(const struct i915_gem_context *ctx)
{
return test_bit(CONTEXT_BANNED, &ctx->flags);

View File

@ -137,6 +137,7 @@ struct i915_gem_context {
#define UCONTEXT_NO_ERROR_CAPTURE 1
#define UCONTEXT_BANNABLE 2
#define UCONTEXT_RECOVERABLE 3
#define UCONTEXT_PERSISTENCE 4
/**
* @flags: small set of booleans

View File

@ -256,6 +256,7 @@ static const struct drm_i915_gem_object_ops i915_gem_object_dmabuf_ops = {
struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
struct dma_buf *dma_buf)
{
static struct lock_class_key lock_class;
struct dma_buf_attachment *attach;
struct drm_i915_gem_object *obj;
int ret;
@ -287,7 +288,7 @@ struct drm_gem_object *i915_gem_prime_import(struct drm_device *dev,
}
drm_gem_private_object_init(dev, &obj->base, dma_buf->size);
i915_gem_object_init(obj, &i915_gem_object_dmabuf_ops);
i915_gem_object_init(obj, &i915_gem_object_dmabuf_ops, &lock_class);
obj->base.import_attach = attach;
obj->base.resv = dma_buf->resv;

View File

@ -19,6 +19,7 @@
#include "gt/intel_engine_pool.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_pm.h"
#include "gt/intel_ring.h"
#include "i915_drv.h"
#include "i915_gem_clflush.h"

View File

@ -164,6 +164,7 @@ struct drm_i915_gem_object *
i915_gem_object_create_internal(struct drm_i915_private *i915,
phys_addr_t size)
{
static struct lock_class_key lock_class;
struct drm_i915_gem_object *obj;
unsigned int cache_level;
@ -178,7 +179,7 @@ i915_gem_object_create_internal(struct drm_i915_private *i915,
return ERR_PTR(-ENOMEM);
drm_gem_private_object_init(&i915->drm, &obj->base, size);
i915_gem_object_init(obj, &i915_gem_object_internal_ops);
i915_gem_object_init(obj, &i915_gem_object_internal_ops, &lock_class);
/*
* Mark the object as volatile, such that the pages are marked as

View File

@ -0,0 +1,99 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2019 Intel Corporation
*/
#include "intel_memory_region.h"
#include "gem/i915_gem_region.h"
#include "gem/i915_gem_lmem.h"
#include "i915_drv.h"
const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops = {
.flags = I915_GEM_OBJECT_HAS_IOMEM,
.get_pages = i915_gem_object_get_pages_buddy,
.put_pages = i915_gem_object_put_pages_buddy,
.release = i915_gem_object_release_memory_region,
};
/* XXX: Time to vfunc your life up? */
void __iomem *
i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
unsigned long n)
{
resource_size_t offset;
offset = i915_gem_object_get_dma_address(obj, n);
offset -= obj->mm.region->region.start;
return io_mapping_map_wc(&obj->mm.region->iomap, offset, PAGE_SIZE);
}
void __iomem *
i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
unsigned long n)
{
resource_size_t offset;
offset = i915_gem_object_get_dma_address(obj, n);
offset -= obj->mm.region->region.start;
return io_mapping_map_atomic_wc(&obj->mm.region->iomap, offset);
}
void __iomem *
i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
unsigned long n,
unsigned long size)
{
resource_size_t offset;
GEM_BUG_ON(!i915_gem_object_is_contiguous(obj));
offset = i915_gem_object_get_dma_address(obj, n);
offset -= obj->mm.region->region.start;
return io_mapping_map_wc(&obj->mm.region->iomap, offset, size);
}
bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
{
return obj->ops == &i915_gem_lmem_obj_ops;
}
struct drm_i915_gem_object *
i915_gem_object_create_lmem(struct drm_i915_private *i915,
resource_size_t size,
unsigned int flags)
{
return i915_gem_object_create_region(i915->mm.regions[INTEL_REGION_LMEM],
size, flags);
}
struct drm_i915_gem_object *
__i915_gem_lmem_object_create(struct intel_memory_region *mem,
resource_size_t size,
unsigned int flags)
{
static struct lock_class_key lock_class;
struct drm_i915_private *i915 = mem->i915;
struct drm_i915_gem_object *obj;
if (size > BIT(mem->mm.max_order) * mem->mm.chunk_size)
return ERR_PTR(-E2BIG);
obj = i915_gem_object_alloc();
if (!obj)
return ERR_PTR(-ENOMEM);
drm_gem_private_object_init(&i915->drm, &obj->base, size);
i915_gem_object_init(obj, &i915_gem_lmem_obj_ops, &lock_class);
obj->read_domains = I915_GEM_DOMAIN_WC | I915_GEM_DOMAIN_GTT;
i915_gem_object_set_cache_coherency(obj, I915_CACHE_NONE);
i915_gem_object_init_memory_region(obj, mem, flags);
return obj;
}

View File

@ -0,0 +1,37 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2019 Intel Corporation
*/
#ifndef __I915_GEM_LMEM_H
#define __I915_GEM_LMEM_H
#include <linux/types.h>
struct drm_i915_private;
struct drm_i915_gem_object;
struct intel_memory_region;
extern const struct drm_i915_gem_object_ops i915_gem_lmem_obj_ops;
void __iomem *i915_gem_object_lmem_io_map(struct drm_i915_gem_object *obj,
unsigned long n, unsigned long size);
void __iomem *i915_gem_object_lmem_io_map_page(struct drm_i915_gem_object *obj,
unsigned long n);
void __iomem *
i915_gem_object_lmem_io_map_page_atomic(struct drm_i915_gem_object *obj,
unsigned long n);
bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
struct drm_i915_gem_object *
i915_gem_object_create_lmem(struct drm_i915_private *i915,
resource_size_t size,
unsigned int flags);
struct drm_i915_gem_object *
__i915_gem_lmem_object_create(struct intel_memory_region *mem,
resource_size_t size,
unsigned int flags);
#endif /* !__I915_GEM_LMEM_H */

View File

@ -312,7 +312,7 @@ vm_fault_t i915_gem_fault(struct vm_fault *vmf)
list_add(&obj->userfault_link, &i915->ggtt.userfault_list);
mutex_unlock(&i915->ggtt.vm.mutex);
if (CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND)
if (IS_ACTIVE(CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND))
intel_wakeref_auto(&i915->ggtt.userfault_wakeref,
msecs_to_jiffies_timeout(CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND));

View File

@ -47,9 +47,10 @@ void i915_gem_object_free(struct drm_i915_gem_object *obj)
}
void i915_gem_object_init(struct drm_i915_gem_object *obj,
const struct drm_i915_gem_object_ops *ops)
const struct drm_i915_gem_object_ops *ops,
struct lock_class_key *key)
{
mutex_init(&obj->mm.lock);
__mutex_init(&obj->mm.lock, "obj->mm.lock", key);
spin_lock_init(&obj->vma.lock);
INIT_LIST_HEAD(&obj->vma.list);

View File

@ -23,7 +23,8 @@ struct drm_i915_gem_object *i915_gem_object_alloc(void);
void i915_gem_object_free(struct drm_i915_gem_object *obj);
void i915_gem_object_init(struct drm_i915_gem_object *obj,
const struct drm_i915_gem_object_ops *ops);
const struct drm_i915_gem_object_ops *ops,
struct lock_class_key *key);
struct drm_i915_gem_object *
i915_gem_object_create_shmem(struct drm_i915_private *i915,
resource_size_t size);
@ -461,6 +462,5 @@ int i915_gem_object_wait(struct drm_i915_gem_object *obj,
int i915_gem_object_wait_priority(struct drm_i915_gem_object *obj,
unsigned int flags,
const struct i915_sched_attr *attr);
#define I915_PRIORITY_DISPLAY I915_USER_PRIORITY(I915_PRIORITY_MAX)
#endif

View File

@ -8,6 +8,7 @@
#include "gt/intel_engine_pm.h"
#include "gt/intel_engine_pool.h"
#include "gt/intel_gt.h"
#include "gt/intel_ring.h"
#include "i915_gem_clflush.h"
#include "i915_gem_object_blt.h"
@ -16,7 +17,7 @@ struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
u32 value)
{
struct drm_i915_private *i915 = ce->vm->i915;
const u32 block_size = S16_MAX * PAGE_SIZE;
const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
struct intel_engine_pool_node *pool;
struct i915_vma *batch;
u64 offset;
@ -29,7 +30,7 @@ struct i915_vma *intel_emit_vma_fill_blt(struct intel_context *ce,
GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
intel_engine_pm_get(ce->engine);
count = div_u64(vma->size, block_size);
count = div_u64(round_up(vma->size, block_size), block_size);
size = (1 + 8 * count) * sizeof(u32);
size = round_up(size, PAGE_SIZE);
pool = intel_engine_get_pool(ce->engine, size);
@ -200,7 +201,7 @@ struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
struct i915_vma *dst)
{
struct drm_i915_private *i915 = ce->vm->i915;
const u32 block_size = S16_MAX * PAGE_SIZE;
const u32 block_size = SZ_8M; /* ~1ms at 8GiB/s preemption delay */
struct intel_engine_pool_node *pool;
struct i915_vma *batch;
u64 src_offset, dst_offset;
@ -213,7 +214,7 @@ struct i915_vma *intel_emit_vma_copy_blt(struct intel_context *ce,
GEM_BUG_ON(intel_engine_is_virtual(ce->engine));
intel_engine_pm_get(ce->engine);
count = div_u64(dst->size, block_size);
count = div_u64(round_up(dst->size, block_size), block_size);
size = (1 + 11 * count) * sizeof(u32);
size = round_up(size, PAGE_SIZE);
pool = intel_engine_get_pool(ce->engine, size);

View File

@ -31,10 +31,11 @@ struct i915_lut_handle {
struct drm_i915_gem_object_ops {
unsigned int flags;
#define I915_GEM_OBJECT_HAS_STRUCT_PAGE BIT(0)
#define I915_GEM_OBJECT_IS_SHRINKABLE BIT(1)
#define I915_GEM_OBJECT_IS_PROXY BIT(2)
#define I915_GEM_OBJECT_NO_GGTT BIT(3)
#define I915_GEM_OBJECT_ASYNC_CANCEL BIT(4)
#define I915_GEM_OBJECT_HAS_IOMEM BIT(1)
#define I915_GEM_OBJECT_IS_SHRINKABLE BIT(2)
#define I915_GEM_OBJECT_IS_PROXY BIT(3)
#define I915_GEM_OBJECT_NO_GGTT BIT(4)
#define I915_GEM_OBJECT_ASYNC_CANCEL BIT(5)
/* Interface between the GEM object and its backing storage.
* get_pages() is called once prior to the use of the associated set

View File

@ -7,6 +7,7 @@
#include "i915_drv.h"
#include "i915_gem_object.h"
#include "i915_scatterlist.h"
#include "i915_gem_lmem.h"
void __i915_gem_object_set_pages(struct drm_i915_gem_object *obj,
struct sg_table *pages,
@ -154,6 +155,16 @@ static void __i915_gem_object_reset_page_iter(struct drm_i915_gem_object *obj)
rcu_read_unlock();
}
static void unmap_object(struct drm_i915_gem_object *obj, void *ptr)
{
if (i915_gem_object_is_lmem(obj))
io_mapping_unmap((void __force __iomem *)ptr);
else if (is_vmalloc_addr(ptr))
vunmap(ptr);
else
kunmap(kmap_to_page(ptr));
}
struct sg_table *
__i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
{
@ -169,14 +180,7 @@ __i915_gem_object_unset_pages(struct drm_i915_gem_object *obj)
i915_gem_object_make_unshrinkable(obj);
if (obj->mm.mapping) {
void *ptr;
ptr = page_mask_bits(obj->mm.mapping);
if (is_vmalloc_addr(ptr))
vunmap(ptr);
else
kunmap(kmap_to_page(ptr));
unmap_object(obj, page_mask_bits(obj->mm.mapping));
obj->mm.mapping = NULL;
}
@ -231,7 +235,7 @@ unlock:
}
/* The 'mapping' part of i915_gem_object_pin_map() below */
static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
static void *i915_gem_object_map(struct drm_i915_gem_object *obj,
enum i915_map_type type)
{
unsigned long n_pages = obj->base.size >> PAGE_SHIFT;
@ -244,6 +248,16 @@ static void *i915_gem_object_map(const struct drm_i915_gem_object *obj,
pgprot_t pgprot;
void *addr;
if (i915_gem_object_is_lmem(obj)) {
void __iomem *io;
if (type != I915_MAP_WC)
return NULL;
io = i915_gem_object_lmem_io_map(obj, 0, obj->base.size);
return (void __force *)io;
}
/* A single page can always be kmapped */
if (n_pages == 1 && type == I915_MAP_WB)
return kmap(sg_page(sgt->sgl));
@ -285,11 +299,13 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
enum i915_map_type type)
{
enum i915_map_type has_type;
unsigned int flags;
bool pinned;
void *ptr;
int err;
if (unlikely(!i915_gem_object_has_struct_page(obj)))
flags = I915_GEM_OBJECT_HAS_STRUCT_PAGE | I915_GEM_OBJECT_HAS_IOMEM;
if (!i915_gem_object_type_has(obj, flags))
return ERR_PTR(-ENXIO);
err = mutex_lock_interruptible(&obj->mm.lock);
@ -321,10 +337,7 @@ void *i915_gem_object_pin_map(struct drm_i915_gem_object *obj,
goto err_unpin;
}
if (is_vmalloc_addr(ptr))
vunmap(ptr);
else
kunmap(kmap_to_page(ptr));
unmap_object(obj, ptr);
ptr = obj->mm.mapping = NULL;
}

View File

@ -11,25 +11,6 @@
#include "i915_drv.h"
static int pm_notifier(struct notifier_block *nb,
unsigned long action,
void *data)
{
struct drm_i915_private *i915 =
container_of(nb, typeof(*i915), gem.pm_notifier);
switch (action) {
case INTEL_GT_UNPARK:
break;
case INTEL_GT_PARK:
i915_vma_parked(i915);
break;
}
return NOTIFY_OK;
}
static bool switch_to_kernel_context_sync(struct intel_gt *gt)
{
bool result = !intel_gt_is_wedged(gt);
@ -56,11 +37,6 @@ static bool switch_to_kernel_context_sync(struct intel_gt *gt)
return result;
}
bool i915_gem_load_power_context(struct drm_i915_private *i915)
{
return switch_to_kernel_context_sync(&i915->gt);
}
static void user_forcewake(struct intel_gt *gt, bool suspend)
{
int count = atomic_read(&gt->user_wakeref);
@ -100,8 +76,6 @@ void i915_gem_suspend(struct drm_i915_private *i915)
intel_gt_suspend(&i915->gt);
intel_uc_suspend(&i915->gt.uc);
cancel_delayed_work_sync(&i915->gt.hangcheck.work);
i915_gem_drain_freed_objects(i915);
}
@ -190,7 +164,7 @@ void i915_gem_resume(struct drm_i915_private *i915)
intel_uc_resume(&i915->gt.uc);
/* Always reload a context for powersaving. */
if (!i915_gem_load_power_context(i915))
if (!switch_to_kernel_context_sync(&i915->gt))
goto err_wedged;
user_forcewake(&i915->gt, false);
@ -207,10 +181,3 @@ err_wedged:
}
goto out_unlock;
}
void i915_gem_init__pm(struct drm_i915_private *i915)
{
i915->gem.pm_notifier.notifier_call = pm_notifier;
blocking_notifier_chain_register(&i915->gt.pm_notifications,
&i915->gem.pm_notifier);
}

View File

@ -12,9 +12,6 @@
struct drm_i915_private;
struct work_struct;
void i915_gem_init__pm(struct drm_i915_private *i915);
bool i915_gem_load_power_context(struct drm_i915_private *i915);
void i915_gem_resume(struct drm_i915_private *i915);
void i915_gem_idle_work_handler(struct work_struct *work);

View File

@ -465,6 +465,7 @@ create_shmem(struct intel_memory_region *mem,
resource_size_t size,
unsigned int flags)
{
static struct lock_class_key lock_class;
struct drm_i915_private *i915 = mem->i915;
struct drm_i915_gem_object *obj;
struct address_space *mapping;
@ -491,7 +492,7 @@ create_shmem(struct intel_memory_region *mem,
mapping_set_gfp_mask(mapping, mask);
GEM_BUG_ON(!(mapping_gfp_mask(mapping) & __GFP_RECLAIM));
i915_gem_object_init(obj, &i915_gem_shmem_ops);
i915_gem_object_init(obj, &i915_gem_shmem_ops, &lock_class);
obj->write_domain = I915_GEM_DOMAIN_CPU;
obj->read_domains = I915_GEM_DOMAIN_CPU;

View File

@ -556,6 +556,7 @@ __i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
struct drm_mm_node *stolen,
struct intel_memory_region *mem)
{
static struct lock_class_key lock_class;
struct drm_i915_gem_object *obj;
unsigned int cache_level;
int err = -ENOMEM;
@ -565,7 +566,7 @@ __i915_gem_object_create_stolen(struct drm_i915_private *dev_priv,
goto err;
drm_gem_private_object_init(&dev_priv->drm, &obj->base, stolen->size);
i915_gem_object_init(obj, &i915_gem_object_stolen_ops);
i915_gem_object_init(obj, &i915_gem_object_stolen_ops, &lock_class);
obj->stolen = stolen;
obj->read_domains = I915_GEM_DOMAIN_CPU | I915_GEM_DOMAIN_GTT;

View File

@ -725,6 +725,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
void *data,
struct drm_file *file)
{
static struct lock_class_key lock_class;
struct drm_i915_private *dev_priv = to_i915(dev);
struct drm_i915_gem_userptr *args = data;
struct drm_i915_gem_object *obj;
@ -769,7 +770,7 @@ i915_gem_userptr_ioctl(struct drm_device *dev,
return -ENOMEM;
drm_gem_private_object_init(dev, &obj->base, args->user_size);
i915_gem_object_init(obj, &i915_gem_userptr_ops);
i915_gem_object_init(obj, &i915_gem_userptr_ops, &lock_class);
obj->read_domains = I915_GEM_DOMAIN_CPU;
obj->write_domain = I915_GEM_DOMAIN_CPU;
i915_gem_object_set_cache_coherency(obj, I915_CACHE_LLC);

View File

@ -96,6 +96,7 @@ huge_gem_object(struct drm_i915_private *i915,
phys_addr_t phys_size,
dma_addr_t dma_size)
{
static struct lock_class_key lock_class;
struct drm_i915_gem_object *obj;
unsigned int cache_level;
@ -111,7 +112,7 @@ huge_gem_object(struct drm_i915_private *i915,
return ERR_PTR(-ENOMEM);
drm_gem_private_object_init(&i915->drm, &obj->base, dma_size);
i915_gem_object_init(obj, &huge_ops);
i915_gem_object_init(obj, &huge_ops, &lock_class);
obj->read_domains = I915_GEM_DOMAIN_CPU;
obj->write_domain = I915_GEM_DOMAIN_CPU;

View File

@ -9,6 +9,7 @@
#include "i915_selftest.h"
#include "gem/i915_gem_region.h"
#include "gem/i915_gem_lmem.h"
#include "gem/i915_gem_pm.h"
#include "gt/intel_gt.h"
@ -149,6 +150,7 @@ huge_pages_object(struct drm_i915_private *i915,
u64 size,
unsigned int page_mask)
{
static struct lock_class_key lock_class;
struct drm_i915_gem_object *obj;
GEM_BUG_ON(!size);
@ -165,7 +167,7 @@ huge_pages_object(struct drm_i915_private *i915,
return ERR_PTR(-ENOMEM);
drm_gem_private_object_init(&i915->drm, &obj->base, size);
i915_gem_object_init(obj, &huge_page_ops);
i915_gem_object_init(obj, &huge_page_ops, &lock_class);
i915_gem_object_set_volatile(obj);
@ -295,6 +297,7 @@ static const struct drm_i915_gem_object_ops fake_ops_single = {
static struct drm_i915_gem_object *
fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single)
{
static struct lock_class_key lock_class;
struct drm_i915_gem_object *obj;
GEM_BUG_ON(!size);
@ -313,9 +316,9 @@ fake_huge_pages_object(struct drm_i915_private *i915, u64 size, bool single)
drm_gem_private_object_init(&i915->drm, &obj->base, size);
if (single)
i915_gem_object_init(obj, &fake_ops_single);
i915_gem_object_init(obj, &fake_ops_single, &lock_class);
else
i915_gem_object_init(obj, &fake_ops);
i915_gem_object_init(obj, &fake_ops, &lock_class);
i915_gem_object_set_volatile(obj);
@ -981,7 +984,8 @@ static int gpu_write(struct intel_context *ce,
vma->size >> PAGE_SHIFT, val);
}
static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
static int
__cpu_check_shmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
{
unsigned int needs_flush;
unsigned long n;
@ -1013,6 +1017,51 @@ static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
return err;
}
static int __cpu_check_lmem(struct drm_i915_gem_object *obj, u32 dword, u32 val)
{
unsigned long n;
int err;
i915_gem_object_lock(obj);
err = i915_gem_object_set_to_wc_domain(obj, false);
i915_gem_object_unlock(obj);
if (err)
return err;
err = i915_gem_object_pin_pages(obj);
if (err)
return err;
for (n = 0; n < obj->base.size >> PAGE_SHIFT; ++n) {
u32 __iomem *base;
u32 read_val;
base = i915_gem_object_lmem_io_map_page_atomic(obj, n);
read_val = ioread32(base + dword);
io_mapping_unmap_atomic(base);
if (read_val != val) {
pr_err("n=%lu base[%u]=%u, val=%u\n",
n, dword, read_val, val);
err = -EINVAL;
break;
}
}
i915_gem_object_unpin_pages(obj);
return err;
}
static int cpu_check(struct drm_i915_gem_object *obj, u32 dword, u32 val)
{
if (i915_gem_object_has_struct_page(obj))
return __cpu_check_shmem(obj, dword, val);
else if (i915_gem_object_is_lmem(obj))
return __cpu_check_lmem(obj, dword, val);
return -ENODEV;
}
static int __igt_write_huge(struct intel_context *ce,
struct drm_i915_gem_object *obj,
u64 size, u64 offset,
@ -1268,131 +1317,235 @@ out_device:
return err;
}
static int igt_ppgtt_internal_huge(void *arg)
{
struct i915_gem_context *ctx = arg;
struct drm_i915_private *i915 = ctx->i915;
struct drm_i915_gem_object *obj;
static const unsigned int sizes[] = {
SZ_64K,
SZ_128K,
SZ_256K,
SZ_512K,
SZ_1M,
SZ_2M,
};
int i;
int err;
/*
* Sanity check that the HW uses huge pages correctly through internal
* -- ensure that our writes land in the right place.
*/
for (i = 0; i < ARRAY_SIZE(sizes); ++i) {
unsigned int size = sizes[i];
obj = i915_gem_object_create_internal(i915, size);
if (IS_ERR(obj))
return PTR_ERR(obj);
err = i915_gem_object_pin_pages(obj);
if (err)
goto out_put;
if (obj->mm.page_sizes.phys < I915_GTT_PAGE_SIZE_64K) {
pr_info("internal unable to allocate huge-page(s) with size=%u\n",
size);
goto out_unpin;
}
err = igt_write_huge(ctx, obj);
if (err) {
pr_err("internal write-huge failed with size=%u\n",
size);
goto out_unpin;
}
i915_gem_object_unpin_pages(obj);
__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
i915_gem_object_put(obj);
}
return 0;
out_unpin:
i915_gem_object_unpin_pages(obj);
out_put:
i915_gem_object_put(obj);
return err;
}
typedef struct drm_i915_gem_object *
(*igt_create_fn)(struct drm_i915_private *i915, u32 size, u32 flags);
static inline bool igt_can_allocate_thp(struct drm_i915_private *i915)
{
return i915->mm.gemfs && has_transparent_hugepage();
}
static int igt_ppgtt_gemfs_huge(void *arg)
static struct drm_i915_gem_object *
igt_create_shmem(struct drm_i915_private *i915, u32 size, u32 flags)
{
if (!igt_can_allocate_thp(i915)) {
pr_info("%s missing THP support, skipping\n", __func__);
return ERR_PTR(-ENODEV);
}
return i915_gem_object_create_shmem(i915, size);
}
static struct drm_i915_gem_object *
igt_create_internal(struct drm_i915_private *i915, u32 size, u32 flags)
{
return i915_gem_object_create_internal(i915, size);
}
static struct drm_i915_gem_object *
igt_create_system(struct drm_i915_private *i915, u32 size, u32 flags)
{
return huge_pages_object(i915, size, size);
}
static struct drm_i915_gem_object *
igt_create_local(struct drm_i915_private *i915, u32 size, u32 flags)
{
return i915_gem_object_create_lmem(i915, size, flags);
}
static u32 igt_random_size(struct rnd_state *prng,
u32 min_page_size,
u32 max_page_size)
{
u64 mask;
u32 size;
GEM_BUG_ON(!is_power_of_2(min_page_size));
GEM_BUG_ON(!is_power_of_2(max_page_size));
GEM_BUG_ON(min_page_size < PAGE_SIZE);
GEM_BUG_ON(min_page_size > max_page_size);
mask = ((max_page_size << 1ULL) - 1) & PAGE_MASK;
size = prandom_u32_state(prng) & mask;
if (size < min_page_size)
size |= min_page_size;
return size;
}
static int igt_ppgtt_smoke_huge(void *arg)
{
struct i915_gem_context *ctx = arg;
struct drm_i915_private *i915 = ctx->i915;
struct drm_i915_gem_object *obj;
static const unsigned int sizes[] = {
SZ_2M,
SZ_4M,
SZ_8M,
SZ_16M,
SZ_32M,
I915_RND_STATE(prng);
struct {
igt_create_fn fn;
u32 min;
u32 max;
} backends[] = {
{ igt_create_internal, SZ_64K, SZ_2M, },
{ igt_create_shmem, SZ_64K, SZ_32M, },
{ igt_create_local, SZ_64K, SZ_1G, },
};
int i;
int err;
int i;
/*
* Sanity check that the HW uses huge pages correctly through gemfs --
* ensure that our writes land in the right place.
* Sanity check that the HW uses huge pages correctly through our
* various backends -- ensure that our writes land in the right place.
*/
if (!igt_can_allocate_thp(i915)) {
pr_info("missing THP support, skipping\n");
return 0;
}
for (i = 0; i < ARRAY_SIZE(backends); ++i) {
u32 min = backends[i].min;
u32 max = backends[i].max;
u32 size = max;
try_again:
size = igt_random_size(&prng, min, rounddown_pow_of_two(size));
for (i = 0; i < ARRAY_SIZE(sizes); ++i) {
unsigned int size = sizes[i];
obj = backends[i].fn(i915, size, 0);
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
if (err == -E2BIG) {
size >>= 1;
goto try_again;
} else if (err == -ENODEV) {
err = 0;
continue;
}
obj = i915_gem_object_create_shmem(i915, size);
if (IS_ERR(obj))
return PTR_ERR(obj);
return err;
}
err = i915_gem_object_pin_pages(obj);
if (err)
if (err) {
if (err == -ENXIO) {
i915_gem_object_put(obj);
size >>= 1;
goto try_again;
}
goto out_put;
}
if (obj->mm.page_sizes.phys < I915_GTT_PAGE_SIZE_2M) {
pr_info("finishing test early, gemfs unable to allocate huge-page(s) with size=%u\n",
size);
if (obj->mm.page_sizes.phys < min) {
pr_info("%s unable to allocate huge-page(s) with size=%u, i=%d\n",
__func__, size, i);
err = -ENOMEM;
goto out_unpin;
}
err = igt_write_huge(ctx, obj);
if (err) {
pr_err("gemfs write-huge failed with size=%u\n",
size);
goto out_unpin;
pr_err("%s write-huge failed with size=%u, i=%d\n",
__func__, size, i);
}
out_unpin:
i915_gem_object_unpin_pages(obj);
__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
out_put:
i915_gem_object_put(obj);
if (err == -ENOMEM || err == -ENXIO)
err = 0;
if (err)
break;
cond_resched();
}
return 0;
return err;
}
out_unpin:
i915_gem_object_unpin_pages(obj);
out_put:
i915_gem_object_put(obj);
static int igt_ppgtt_sanity_check(void *arg)
{
struct i915_gem_context *ctx = arg;
struct drm_i915_private *i915 = ctx->i915;
unsigned int supported = INTEL_INFO(i915)->page_sizes;
struct {
igt_create_fn fn;
unsigned int flags;
} backends[] = {
{ igt_create_system, 0, },
{ igt_create_local, I915_BO_ALLOC_CONTIGUOUS, },
};
struct {
u32 size;
u32 pages;
} combos[] = {
{ SZ_64K, SZ_64K },
{ SZ_2M, SZ_2M },
{ SZ_2M, SZ_64K },
{ SZ_2M - SZ_64K, SZ_64K },
{ SZ_2M - SZ_4K, SZ_64K | SZ_4K },
{ SZ_2M + SZ_4K, SZ_64K | SZ_4K },
{ SZ_2M + SZ_4K, SZ_2M | SZ_4K },
{ SZ_2M + SZ_64K, SZ_2M | SZ_64K },
};
int i, j;
int err;
if (supported == I915_GTT_PAGE_SIZE_4K)
return 0;
/*
* Sanity check that the HW behaves with a limited set of combinations.
* We already have a bunch of randomised testing, which should give us
* a decent amount of variation between runs, however we should keep
* this to limit the chances of introducing a temporary regression, by
* testing the most obvious cases that might make something blow up.
*/
for (i = 0; i < ARRAY_SIZE(backends); ++i) {
for (j = 0; j < ARRAY_SIZE(combos); ++j) {
struct drm_i915_gem_object *obj;
u32 size = combos[j].size;
u32 pages = combos[j].pages;
obj = backends[i].fn(i915, size, backends[i].flags);
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
if (err == -ENODEV) {
pr_info("Device lacks local memory, skipping\n");
err = 0;
break;
}
return err;
}
err = i915_gem_object_pin_pages(obj);
if (err) {
i915_gem_object_put(obj);
goto out;
}
GEM_BUG_ON(pages > obj->base.size);
pages = pages & supported;
if (pages)
obj->mm.page_sizes.sg = pages;
err = igt_write_huge(ctx, obj);
i915_gem_object_unpin_pages(obj);
__i915_gem_object_put_pages(obj, I915_MM_NORMAL);
i915_gem_object_put(obj);
if (err) {
pr_err("%s write-huge failed with size=%u pages=%u i=%d, j=%d\n",
__func__, size, pages, i, j);
goto out;
}
}
cond_resched();
}
out:
if (err == -ENOMEM)
err = 0;
return err;
}
@ -1756,8 +1909,8 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *i915)
SUBTEST(igt_ppgtt_pin_update),
SUBTEST(igt_tmpfs_fallback),
SUBTEST(igt_ppgtt_exhaust_huge),
SUBTEST(igt_ppgtt_gemfs_huge),
SUBTEST(igt_ppgtt_internal_huge),
SUBTEST(igt_ppgtt_smoke_huge),
SUBTEST(igt_ppgtt_sanity_check),
};
struct drm_file *file;
struct i915_gem_context *ctx;

View File

@ -5,6 +5,7 @@
#include "i915_selftest.h"
#include "gt/intel_engine_user.h"
#include "gt/intel_gt.h"
#include "selftests/igt_flush_test.h"
@ -12,10 +13,9 @@
#include "huge_gem_object.h"
#include "mock_context.h"
static int igt_client_fill(void *arg)
static int __igt_client_fill(struct intel_engine_cs *engine)
{
struct drm_i915_private *i915 = arg;
struct intel_context *ce = i915->engine[BCS0]->kernel_context;
struct intel_context *ce = engine->kernel_context;
struct drm_i915_gem_object *obj;
struct rnd_state prng;
IGT_TIMEOUT(end);
@ -37,7 +37,7 @@ static int igt_client_fill(void *arg)
pr_debug("%s with phys_sz= %x, sz=%x, val=%x\n", __func__,
phys_sz, sz, val);
obj = huge_gem_object(i915, phys_sz, sz);
obj = huge_gem_object(engine->i915, phys_sz, sz);
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
goto err_flush;
@ -103,6 +103,28 @@ err_flush:
return err;
}
static int igt_client_fill(void *arg)
{
int inst = 0;
do {
struct intel_engine_cs *engine;
int err;
engine = intel_engine_lookup_user(arg,
I915_ENGINE_CLASS_COPY,
inst++);
if (!engine)
return 0;
err = __igt_client_fill(engine);
if (err == -ENOMEM)
err = 0;
if (err)
return err;
} while (1);
}
int i915_gem_client_blt_live_selftests(struct drm_i915_private *i915)
{
static const struct i915_subtest tests[] = {

View File

@ -8,13 +8,17 @@
#include "gt/intel_gt.h"
#include "gt/intel_gt_pm.h"
#include "gt/intel_ring.h"
#include "i915_selftest.h"
#include "selftests/i915_random.h"
static int cpu_set(struct drm_i915_gem_object *obj,
unsigned long offset,
u32 v)
struct context {
struct drm_i915_gem_object *obj;
struct intel_engine_cs *engine;
};
static int cpu_set(struct context *ctx, unsigned long offset, u32 v)
{
unsigned int needs_clflush;
struct page *page;
@ -22,11 +26,11 @@ static int cpu_set(struct drm_i915_gem_object *obj,
u32 *cpu;
int err;
err = i915_gem_object_prepare_write(obj, &needs_clflush);
err = i915_gem_object_prepare_write(ctx->obj, &needs_clflush);
if (err)
return err;
page = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
map = kmap_atomic(page);
cpu = map + offset_in_page(offset);
@ -39,14 +43,12 @@ static int cpu_set(struct drm_i915_gem_object *obj,
drm_clflush_virt_range(cpu, sizeof(*cpu));
kunmap_atomic(map);
i915_gem_object_finish_access(obj);
i915_gem_object_finish_access(ctx->obj);
return 0;
}
static int cpu_get(struct drm_i915_gem_object *obj,
unsigned long offset,
u32 *v)
static int cpu_get(struct context *ctx, unsigned long offset, u32 *v)
{
unsigned int needs_clflush;
struct page *page;
@ -54,11 +56,11 @@ static int cpu_get(struct drm_i915_gem_object *obj,
u32 *cpu;
int err;
err = i915_gem_object_prepare_read(obj, &needs_clflush);
err = i915_gem_object_prepare_read(ctx->obj, &needs_clflush);
if (err)
return err;
page = i915_gem_object_get_page(obj, offset >> PAGE_SHIFT);
page = i915_gem_object_get_page(ctx->obj, offset >> PAGE_SHIFT);
map = kmap_atomic(page);
cpu = map + offset_in_page(offset);
@ -68,26 +70,24 @@ static int cpu_get(struct drm_i915_gem_object *obj,
*v = *cpu;
kunmap_atomic(map);
i915_gem_object_finish_access(obj);
i915_gem_object_finish_access(ctx->obj);
return 0;
}
static int gtt_set(struct drm_i915_gem_object *obj,
unsigned long offset,
u32 v)
static int gtt_set(struct context *ctx, unsigned long offset, u32 v)
{
struct i915_vma *vma;
u32 __iomem *map;
int err = 0;
i915_gem_object_lock(obj);
err = i915_gem_object_set_to_gtt_domain(obj, true);
i915_gem_object_unlock(obj);
i915_gem_object_lock(ctx->obj);
err = i915_gem_object_set_to_gtt_domain(ctx->obj, true);
i915_gem_object_unlock(ctx->obj);
if (err)
return err;
vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, PIN_MAPPABLE);
vma = i915_gem_object_ggtt_pin(ctx->obj, NULL, 0, 0, PIN_MAPPABLE);
if (IS_ERR(vma))
return PTR_ERR(vma);
@ -108,21 +108,19 @@ out_rpm:
return err;
}
static int gtt_get(struct drm_i915_gem_object *obj,
unsigned long offset,
u32 *v)
static int gtt_get(struct context *ctx, unsigned long offset, u32 *v)
{
struct i915_vma *vma;
u32 __iomem *map;
int err = 0;
i915_gem_object_lock(obj);
err = i915_gem_object_set_to_gtt_domain(obj, false);
i915_gem_object_unlock(obj);
i915_gem_object_lock(ctx->obj);
err = i915_gem_object_set_to_gtt_domain(ctx->obj, false);
i915_gem_object_unlock(ctx->obj);
if (err)
return err;
vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, PIN_MAPPABLE);
vma = i915_gem_object_ggtt_pin(ctx->obj, NULL, 0, 0, PIN_MAPPABLE);
if (IS_ERR(vma))
return PTR_ERR(vma);
@ -143,73 +141,66 @@ out_rpm:
return err;
}
static int wc_set(struct drm_i915_gem_object *obj,
unsigned long offset,
u32 v)
static int wc_set(struct context *ctx, unsigned long offset, u32 v)
{
u32 *map;
int err;
i915_gem_object_lock(obj);
err = i915_gem_object_set_to_wc_domain(obj, true);
i915_gem_object_unlock(obj);
i915_gem_object_lock(ctx->obj);
err = i915_gem_object_set_to_wc_domain(ctx->obj, true);
i915_gem_object_unlock(ctx->obj);
if (err)
return err;
map = i915_gem_object_pin_map(obj, I915_MAP_WC);
map = i915_gem_object_pin_map(ctx->obj, I915_MAP_WC);
if (IS_ERR(map))
return PTR_ERR(map);
map[offset / sizeof(*map)] = v;
i915_gem_object_unpin_map(obj);
i915_gem_object_unpin_map(ctx->obj);
return 0;
}
static int wc_get(struct drm_i915_gem_object *obj,
unsigned long offset,
u32 *v)
static int wc_get(struct context *ctx, unsigned long offset, u32 *v)
{
u32 *map;
int err;
i915_gem_object_lock(obj);
err = i915_gem_object_set_to_wc_domain(obj, false);
i915_gem_object_unlock(obj);
i915_gem_object_lock(ctx->obj);
err = i915_gem_object_set_to_wc_domain(ctx->obj, false);
i915_gem_object_unlock(ctx->obj);
if (err)
return err;
map = i915_gem_object_pin_map(obj, I915_MAP_WC);
map = i915_gem_object_pin_map(ctx->obj, I915_MAP_WC);
if (IS_ERR(map))
return PTR_ERR(map);
*v = map[offset / sizeof(*map)];
i915_gem_object_unpin_map(obj);
i915_gem_object_unpin_map(ctx->obj);
return 0;
}
static int gpu_set(struct drm_i915_gem_object *obj,
unsigned long offset,
u32 v)
static int gpu_set(struct context *ctx, unsigned long offset, u32 v)
{
struct drm_i915_private *i915 = to_i915(obj->base.dev);
struct i915_request *rq;
struct i915_vma *vma;
u32 *cs;
int err;
i915_gem_object_lock(obj);
err = i915_gem_object_set_to_gtt_domain(obj, true);
i915_gem_object_unlock(obj);
i915_gem_object_lock(ctx->obj);
err = i915_gem_object_set_to_gtt_domain(ctx->obj, true);
i915_gem_object_unlock(ctx->obj);
if (err)
return err;
vma = i915_gem_object_ggtt_pin(obj, NULL, 0, 0, 0);
vma = i915_gem_object_ggtt_pin(ctx->obj, NULL, 0, 0, 0);
if (IS_ERR(vma))
return PTR_ERR(vma);
rq = i915_request_create(i915->engine[RCS0]->kernel_context);
rq = i915_request_create(ctx->engine->kernel_context);
if (IS_ERR(rq)) {
i915_vma_unpin(vma);
return PTR_ERR(rq);
@ -222,12 +213,12 @@ static int gpu_set(struct drm_i915_gem_object *obj,
return PTR_ERR(cs);
}
if (INTEL_GEN(i915) >= 8) {
if (INTEL_GEN(ctx->engine->i915) >= 8) {
*cs++ = MI_STORE_DWORD_IMM_GEN4 | 1 << 22;
*cs++ = lower_32_bits(i915_ggtt_offset(vma) + offset);
*cs++ = upper_32_bits(i915_ggtt_offset(vma) + offset);
*cs++ = v;
} else if (INTEL_GEN(i915) >= 4) {
} else if (INTEL_GEN(ctx->engine->i915) >= 4) {
*cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
*cs++ = 0;
*cs++ = i915_ggtt_offset(vma) + offset;
@ -252,32 +243,34 @@ static int gpu_set(struct drm_i915_gem_object *obj,
return err;
}
static bool always_valid(struct drm_i915_private *i915)
static bool always_valid(struct context *ctx)
{
return true;
}
static bool needs_fence_registers(struct drm_i915_private *i915)
static bool needs_fence_registers(struct context *ctx)
{
return !intel_gt_is_wedged(&i915->gt);
struct intel_gt *gt = ctx->engine->gt;
if (intel_gt_is_wedged(gt))
return false;
return gt->ggtt->num_fences;
}
static bool needs_mi_store_dword(struct drm_i915_private *i915)
static bool needs_mi_store_dword(struct context *ctx)
{
if (intel_gt_is_wedged(&i915->gt))
if (intel_gt_is_wedged(ctx->engine->gt))
return false;
if (!HAS_ENGINE(i915, RCS0))
return false;
return intel_engine_can_store_dword(i915->engine[RCS0]);
return intel_engine_can_store_dword(ctx->engine);
}
static const struct igt_coherency_mode {
const char *name;
int (*set)(struct drm_i915_gem_object *, unsigned long offset, u32 v);
int (*get)(struct drm_i915_gem_object *, unsigned long offset, u32 *v);
bool (*valid)(struct drm_i915_private *i915);
int (*set)(struct context *ctx, unsigned long offset, u32 v);
int (*get)(struct context *ctx, unsigned long offset, u32 *v);
bool (*valid)(struct context *ctx);
} igt_coherency_mode[] = {
{ "cpu", cpu_set, cpu_get, always_valid },
{ "gtt", gtt_set, gtt_get, needs_fence_registers },
@ -286,18 +279,37 @@ static const struct igt_coherency_mode {
{ },
};
static struct intel_engine_cs *
random_engine(struct drm_i915_private *i915, struct rnd_state *prng)
{
struct intel_engine_cs *engine;
unsigned int count;
count = 0;
for_each_uabi_engine(engine, i915)
count++;
count = i915_prandom_u32_max_state(count, prng);
for_each_uabi_engine(engine, i915)
if (count-- == 0)
return engine;
return NULL;
}
static int igt_gem_coherency(void *arg)
{
const unsigned int ncachelines = PAGE_SIZE/64;
I915_RND_STATE(prng);
struct drm_i915_private *i915 = arg;
const struct igt_coherency_mode *read, *write, *over;
struct drm_i915_gem_object *obj;
unsigned long count, n;
u32 *offsets, *values;
I915_RND_STATE(prng);
struct context ctx;
int err = 0;
/* We repeatedly write, overwrite and read from a sequence of
/*
* We repeatedly write, overwrite and read from a sequence of
* cachelines in order to try and detect incoherency (unflushed writes
* from either the CPU or GPU). Each setter/getter uses our cache
* domain API which should prevent incoherency.
@ -311,31 +323,35 @@ static int igt_gem_coherency(void *arg)
values = offsets + ncachelines;
ctx.engine = random_engine(i915, &prng);
GEM_BUG_ON(!ctx.engine);
pr_info("%s: using %s\n", __func__, ctx.engine->name);
for (over = igt_coherency_mode; over->name; over++) {
if (!over->set)
continue;
if (!over->valid(i915))
if (!over->valid(&ctx))
continue;
for (write = igt_coherency_mode; write->name; write++) {
if (!write->set)
continue;
if (!write->valid(i915))
if (!write->valid(&ctx))
continue;
for (read = igt_coherency_mode; read->name; read++) {
if (!read->get)
continue;
if (!read->valid(i915))
if (!read->valid(&ctx))
continue;
for_each_prime_number_from(count, 1, ncachelines) {
obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
ctx.obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
if (IS_ERR(ctx.obj)) {
err = PTR_ERR(ctx.obj);
goto free;
}
@ -344,7 +360,7 @@ static int igt_gem_coherency(void *arg)
values[n] = prandom_u32_state(&prng);
for (n = 0; n < count; n++) {
err = over->set(obj, offsets[n], ~values[n]);
err = over->set(&ctx, offsets[n], ~values[n]);
if (err) {
pr_err("Failed to set stale value[%ld/%ld] in object using %s, err=%d\n",
n, count, over->name, err);
@ -353,7 +369,7 @@ static int igt_gem_coherency(void *arg)
}
for (n = 0; n < count; n++) {
err = write->set(obj, offsets[n], values[n]);
err = write->set(&ctx, offsets[n], values[n]);
if (err) {
pr_err("Failed to set value[%ld/%ld] in object using %s, err=%d\n",
n, count, write->name, err);
@ -364,7 +380,7 @@ static int igt_gem_coherency(void *arg)
for (n = 0; n < count; n++) {
u32 found;
err = read->get(obj, offsets[n], &found);
err = read->get(&ctx, offsets[n], &found);
if (err) {
pr_err("Failed to get value[%ld/%ld] in object using %s, err=%d\n",
n, count, read->name, err);
@ -382,7 +398,7 @@ static int igt_gem_coherency(void *arg)
}
}
i915_gem_object_put(obj);
i915_gem_object_put(ctx.obj);
}
}
}
@ -392,7 +408,7 @@ free:
return err;
put_object:
i915_gem_object_put(obj);
i915_gem_object_put(ctx.obj);
goto free;
}

View File

@ -32,7 +32,6 @@ static int live_nop_switch(void *arg)
struct drm_i915_private *i915 = arg;
struct intel_engine_cs *engine;
struct i915_gem_context **ctx;
enum intel_engine_id id;
struct igt_live_test t;
struct drm_file *file;
unsigned long n;
@ -67,7 +66,7 @@ static int live_nop_switch(void *arg)
}
}
for_each_engine(engine, i915, id) {
for_each_uabi_engine(engine, i915) {
struct i915_request *rq;
unsigned long end_time, prime;
ktime_t times[2] = {};
@ -170,18 +169,24 @@ static int __live_parallel_switch1(void *data)
struct i915_request *rq = NULL;
int err, n;
for (n = 0; n < ARRAY_SIZE(arg->ce); n++) {
i915_request_put(rq);
err = 0;
for (n = 0; !err && n < ARRAY_SIZE(arg->ce); n++) {
struct i915_request *prev = rq;
rq = i915_request_create(arg->ce[n]);
if (IS_ERR(rq))
if (IS_ERR(rq)) {
i915_request_put(prev);
return PTR_ERR(rq);
}
i915_request_get(rq);
if (prev) {
err = i915_request_await_dma_fence(rq, &prev->fence);
i915_request_put(prev);
}
i915_request_add(rq);
}
err = 0;
if (i915_request_wait(rq, 0, HZ / 5) < 0)
err = -ETIME;
i915_request_put(rq);
@ -198,6 +203,7 @@ static int __live_parallel_switch1(void *data)
static int __live_parallel_switchN(void *data)
{
struct parallel_switch *arg = data;
struct i915_request *rq = NULL;
IGT_TIMEOUT(end_time);
unsigned long count;
int n;
@ -205,17 +211,31 @@ static int __live_parallel_switchN(void *data)
count = 0;
do {
for (n = 0; n < ARRAY_SIZE(arg->ce); n++) {
struct i915_request *rq;
struct i915_request *prev = rq;
int err = 0;
rq = i915_request_create(arg->ce[n]);
if (IS_ERR(rq))
if (IS_ERR(rq)) {
i915_request_put(prev);
return PTR_ERR(rq);
}
i915_request_get(rq);
if (prev) {
err = i915_request_await_dma_fence(rq, &prev->fence);
i915_request_put(prev);
}
i915_request_add(rq);
if (err) {
i915_request_put(rq);
return err;
}
}
count++;
} while (!__igt_timeout(end_time, NULL));
i915_request_put(rq);
pr_info("%s: %lu switches (many)\n", arg->ce[0]->engine->name, count);
return 0;
@ -325,6 +345,8 @@ static int live_parallel_switch(void *arg)
get_task_struct(data[n].tsk);
}
yield(); /* start all threads before we kthread_stop() */
for (n = 0; n < count; n++) {
int status;
@ -583,7 +605,6 @@ static int igt_ctx_exec(void *arg)
{
struct drm_i915_private *i915 = arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err = -ENODEV;
/*
@ -595,7 +616,7 @@ static int igt_ctx_exec(void *arg)
if (!DRIVER_CAPS(i915)->has_logical_contexts)
return 0;
for_each_engine(engine, i915, id) {
for_each_uabi_engine(engine, i915) {
struct drm_i915_gem_object *obj = NULL;
unsigned long ncontexts, ndwords, dw;
struct i915_request *tq[5] = {};
@ -711,7 +732,6 @@ static int igt_shared_ctx_exec(void *arg)
struct i915_request *tq[5] = {};
struct i915_gem_context *parent;
struct intel_engine_cs *engine;
enum intel_engine_id id;
struct igt_live_test t;
struct drm_file *file;
int err = 0;
@ -743,7 +763,7 @@ static int igt_shared_ctx_exec(void *arg)
if (err)
goto out_file;
for_each_engine(engine, i915, id) {
for_each_uabi_engine(engine, i915) {
unsigned long ncontexts, ndwords, dw;
struct drm_i915_gem_object *obj = NULL;
IGT_TIMEOUT(end_time);
@ -1168,93 +1188,90 @@ __igt_ctx_sseu(struct drm_i915_private *i915,
const char *name,
unsigned int flags)
{
struct intel_engine_cs *engine = i915->engine[RCS0];
struct drm_i915_gem_object *obj;
struct i915_gem_context *ctx;
struct intel_context *ce;
struct intel_sseu pg_sseu;
struct drm_file *file;
int ret;
int inst = 0;
int ret = 0;
if (INTEL_GEN(i915) < 9 || !engine)
if (INTEL_GEN(i915) < 9 || !RUNTIME_INFO(i915)->sseu.has_slice_pg)
return 0;
if (!RUNTIME_INFO(i915)->sseu.has_slice_pg)
return 0;
if (hweight32(engine->sseu.slice_mask) < 2)
return 0;
/*
* Gen11 VME friendly power-gated configuration with half enabled
* sub-slices.
*/
pg_sseu = engine->sseu;
pg_sseu.slice_mask = 1;
pg_sseu.subslice_mask =
~(~0 << (hweight32(engine->sseu.subslice_mask) / 2));
pr_info("SSEU subtest '%s', flags=%x, def_slices=%u, pg_slices=%u\n",
name, flags, hweight32(engine->sseu.slice_mask),
hweight32(pg_sseu.slice_mask));
file = mock_file(i915);
if (IS_ERR(file))
return PTR_ERR(file);
if (flags & TEST_RESET)
igt_global_reset_lock(&i915->gt);
ctx = live_context(i915, file);
if (IS_ERR(ctx)) {
ret = PTR_ERR(ctx);
goto out_unlock;
}
i915_gem_context_clear_bannable(ctx); /* to reset and beyond! */
obj = i915_gem_object_create_internal(i915, PAGE_SIZE);
if (IS_ERR(obj)) {
ret = PTR_ERR(obj);
goto out_unlock;
}
ce = i915_gem_context_get_engine(ctx, RCS0);
if (IS_ERR(ce)) {
ret = PTR_ERR(ce);
goto out_put;
}
do {
struct intel_engine_cs *engine;
struct intel_context *ce;
struct intel_sseu pg_sseu;
ret = intel_context_pin(ce);
if (ret)
goto out_context;
engine = intel_engine_lookup_user(i915,
I915_ENGINE_CLASS_RENDER,
inst++);
if (!engine)
break;
/* First set the default mask. */
ret = __sseu_test(name, flags, ce, obj, engine->sseu);
if (ret)
goto out_fail;
if (hweight32(engine->sseu.slice_mask) < 2)
continue;
/* Then set a power-gated configuration. */
ret = __sseu_test(name, flags, ce, obj, pg_sseu);
if (ret)
goto out_fail;
/*
* Gen11 VME friendly power-gated configuration with
* half enabled sub-slices.
*/
pg_sseu = engine->sseu;
pg_sseu.slice_mask = 1;
pg_sseu.subslice_mask =
~(~0 << (hweight32(engine->sseu.subslice_mask) / 2));
/* Back to defaults. */
ret = __sseu_test(name, flags, ce, obj, engine->sseu);
if (ret)
goto out_fail;
pr_info("%s: SSEU subtest '%s', flags=%x, def_slices=%u, pg_slices=%u\n",
engine->name, name, flags,
hweight32(engine->sseu.slice_mask),
hweight32(pg_sseu.slice_mask));
/* One last power-gated configuration for the road. */
ret = __sseu_test(name, flags, ce, obj, pg_sseu);
if (ret)
goto out_fail;
ce = intel_context_create(engine->kernel_context->gem_context,
engine);
if (IS_ERR(ce)) {
ret = PTR_ERR(ce);
goto out_put;
}
ret = intel_context_pin(ce);
if (ret)
goto out_ce;
/* First set the default mask. */
ret = __sseu_test(name, flags, ce, obj, engine->sseu);
if (ret)
goto out_unpin;
/* Then set a power-gated configuration. */
ret = __sseu_test(name, flags, ce, obj, pg_sseu);
if (ret)
goto out_unpin;
/* Back to defaults. */
ret = __sseu_test(name, flags, ce, obj, engine->sseu);
if (ret)
goto out_unpin;
/* One last power-gated configuration for the road. */
ret = __sseu_test(name, flags, ce, obj, pg_sseu);
if (ret)
goto out_unpin;
out_unpin:
intel_context_unpin(ce);
out_ce:
intel_context_put(ce);
} while (!ret);
out_fail:
if (igt_flush_test(i915))
ret = -EIO;
intel_context_unpin(ce);
out_context:
intel_context_put(ce);
out_put:
i915_gem_object_put(obj);
@ -1262,8 +1279,6 @@ out_unlock:
if (flags & TEST_RESET)
igt_global_reset_unlock(&i915->gt);
mock_file_free(i915, file);
if (ret)
pr_err("%s: Failed with %d!\n", name, ret);
@ -1651,7 +1666,6 @@ static int igt_vm_isolation(void *arg)
struct drm_file *file;
I915_RND_STATE(prng);
unsigned long count;
unsigned int id;
u64 vm_total;
int err;
@ -1692,7 +1706,7 @@ static int igt_vm_isolation(void *arg)
vm_total -= I915_GTT_PAGE_SIZE;
count = 0;
for_each_engine(engine, i915, id) {
for_each_uabi_engine(engine, i915) {
IGT_TIMEOUT(end_time);
unsigned long this = 0;

View File

@ -301,6 +301,9 @@ static int igt_partial_tiling(void *arg)
int tiling;
int err;
if (!i915_ggtt_has_aperture(&i915->ggtt))
return 0;
/* We want to check the page mapping and fencing of a large object
* mmapped through the GTT. The object we create is larger than can
* possibly be mmaped as a whole, and so we must use partial GGTT vma.
@ -431,6 +434,9 @@ static int igt_smoke_tiling(void *arg)
IGT_TIMEOUT(end);
int err;
if (!i915_ggtt_has_aperture(&i915->ggtt))
return 0;
/*
* igt_partial_tiling() does an exhastive check of partial tiling
* chunking, but will undoubtably run out of time. Here, we do a
@ -515,20 +521,19 @@ static int make_obj_busy(struct drm_i915_gem_object *obj)
{
struct drm_i915_private *i915 = to_i915(obj->base.dev);
struct intel_engine_cs *engine;
enum intel_engine_id id;
struct i915_vma *vma;
int err;
vma = i915_vma_instance(obj, &i915->ggtt.vm, NULL);
if (IS_ERR(vma))
return PTR_ERR(vma);
err = i915_vma_pin(vma, 0, 0, PIN_USER);
if (err)
return err;
for_each_engine(engine, i915, id) {
for_each_uabi_engine(engine, i915) {
struct i915_request *rq;
struct i915_vma *vma;
int err;
vma = i915_vma_instance(obj, &engine->gt->ggtt->vm, NULL);
if (IS_ERR(vma))
return PTR_ERR(vma);
err = i915_vma_pin(vma, 0, 0, PIN_USER);
if (err)
return err;
rq = i915_request_create(engine->kernel_context);
if (IS_ERR(rq)) {
@ -544,12 +549,13 @@ static int make_obj_busy(struct drm_i915_gem_object *obj)
i915_vma_unlock(vma);
i915_request_add(rq);
i915_vma_unpin(vma);
if (err)
return err;
}
i915_vma_unpin(vma);
i915_gem_object_put(obj); /* leave it only alive via its active ref */
return err;
return 0;
}
static bool assert_mmap_offset(struct drm_i915_private *i915,

View File

@ -3,40 +3,241 @@
* Copyright © 2019 Intel Corporation
*/
#include <linux/sort.h>
#include "gt/intel_gt.h"
#include "gt/intel_engine_user.h"
#include "i915_selftest.h"
#include "gem/i915_gem_context.h"
#include "selftests/igt_flush_test.h"
#include "selftests/i915_random.h"
#include "selftests/mock_drm.h"
#include "huge_gem_object.h"
#include "mock_context.h"
static int igt_fill_blt(void *arg)
static int wrap_ktime_compare(const void *A, const void *B)
{
const ktime_t *a = A, *b = B;
return ktime_compare(*a, *b);
}
static int __perf_fill_blt(struct drm_i915_gem_object *obj)
{
struct drm_i915_private *i915 = to_i915(obj->base.dev);
int inst = 0;
do {
struct intel_engine_cs *engine;
ktime_t t[5];
int pass;
int err;
engine = intel_engine_lookup_user(i915,
I915_ENGINE_CLASS_COPY,
inst++);
if (!engine)
return 0;
for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
struct intel_context *ce = engine->kernel_context;
ktime_t t0, t1;
t0 = ktime_get();
err = i915_gem_object_fill_blt(obj, ce, 0);
if (err)
return err;
err = i915_gem_object_wait(obj,
I915_WAIT_ALL,
MAX_SCHEDULE_TIMEOUT);
if (err)
return err;
t1 = ktime_get();
t[pass] = ktime_sub(t1, t0);
}
sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
pr_info("%s: blt %zd KiB fill: %lld MiB/s\n",
engine->name,
obj->base.size >> 10,
div64_u64(mul_u32_u32(4 * obj->base.size,
1000 * 1000 * 1000),
t[1] + 2 * t[2] + t[3]) >> 20);
} while (1);
}
static int perf_fill_blt(void *arg)
{
struct drm_i915_private *i915 = arg;
struct intel_context *ce = i915->engine[BCS0]->kernel_context;
struct drm_i915_gem_object *obj;
static const unsigned long sizes[] = {
SZ_4K,
SZ_64K,
SZ_2M,
SZ_64M
};
int i;
for (i = 0; i < ARRAY_SIZE(sizes); i++) {
struct drm_i915_gem_object *obj;
int err;
obj = i915_gem_object_create_internal(i915, sizes[i]);
if (IS_ERR(obj))
return PTR_ERR(obj);
err = __perf_fill_blt(obj);
i915_gem_object_put(obj);
if (err)
return err;
}
return 0;
}
static int __perf_copy_blt(struct drm_i915_gem_object *src,
struct drm_i915_gem_object *dst)
{
struct drm_i915_private *i915 = to_i915(src->base.dev);
int inst = 0;
do {
struct intel_engine_cs *engine;
ktime_t t[5];
int pass;
engine = intel_engine_lookup_user(i915,
I915_ENGINE_CLASS_COPY,
inst++);
if (!engine)
return 0;
for (pass = 0; pass < ARRAY_SIZE(t); pass++) {
struct intel_context *ce = engine->kernel_context;
ktime_t t0, t1;
int err;
t0 = ktime_get();
err = i915_gem_object_copy_blt(src, dst, ce);
if (err)
return err;
err = i915_gem_object_wait(dst,
I915_WAIT_ALL,
MAX_SCHEDULE_TIMEOUT);
if (err)
return err;
t1 = ktime_get();
t[pass] = ktime_sub(t1, t0);
}
sort(t, ARRAY_SIZE(t), sizeof(*t), wrap_ktime_compare, NULL);
pr_info("%s: blt %zd KiB copy: %lld MiB/s\n",
engine->name,
src->base.size >> 10,
div64_u64(mul_u32_u32(4 * src->base.size,
1000 * 1000 * 1000),
t[1] + 2 * t[2] + t[3]) >> 20);
} while (1);
}
static int perf_copy_blt(void *arg)
{
struct drm_i915_private *i915 = arg;
static const unsigned long sizes[] = {
SZ_4K,
SZ_64K,
SZ_2M,
SZ_64M
};
int i;
for (i = 0; i < ARRAY_SIZE(sizes); i++) {
struct drm_i915_gem_object *src, *dst;
int err;
src = i915_gem_object_create_internal(i915, sizes[i]);
if (IS_ERR(src))
return PTR_ERR(src);
dst = i915_gem_object_create_internal(i915, sizes[i]);
if (IS_ERR(dst)) {
err = PTR_ERR(dst);
goto err_src;
}
err = __perf_copy_blt(src, dst);
i915_gem_object_put(dst);
err_src:
i915_gem_object_put(src);
if (err)
return err;
}
return 0;
}
struct igt_thread_arg {
struct drm_i915_private *i915;
struct rnd_state prng;
unsigned int n_cpus;
};
static int igt_fill_blt_thread(void *arg)
{
struct igt_thread_arg *thread = arg;
struct drm_i915_private *i915 = thread->i915;
struct rnd_state *prng = &thread->prng;
struct drm_i915_gem_object *obj;
struct i915_gem_context *ctx;
struct intel_context *ce;
struct drm_file *file;
unsigned int prio;
IGT_TIMEOUT(end);
u32 *vaddr;
int err = 0;
int err;
prandom_seed_state(&prng, i915_selftest.random_seed);
file = mock_file(i915);
if (IS_ERR(file))
return PTR_ERR(file);
/*
* XXX: needs some threads to scale all these tests, also maybe throw
* in submission from higher priority context to see if we are
* preempted for very large objects...
*/
ctx = live_context(i915, file);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_file;
}
prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
ctx->sched.priority = I915_USER_PRIORITY(prio);
ce = i915_gem_context_get_engine(ctx, BCS0);
GEM_BUG_ON(IS_ERR(ce));
do {
const u32 max_block_size = S16_MAX * PAGE_SIZE;
u32 sz = min_t(u64, ce->vm->total >> 4, prandom_u32_state(&prng));
u32 phys_sz = sz % (max_block_size + 1);
u32 val = prandom_u32_state(&prng);
u32 val = prandom_u32_state(prng);
u64 total = ce->vm->total;
u32 phys_sz;
u32 sz;
u32 *vaddr;
u32 i;
/*
* If we have a tiny shared address space, like for the GGTT
* then we can't be too greedy.
*/
if (i915_is_ggtt(ce->vm))
total = div64_u64(total, thread->n_cpus);
sz = min_t(u64, total >> 4, prandom_u32_state(prng));
phys_sz = sz % (max_block_size + 1);
sz = round_up(sz, PAGE_SIZE);
phys_sz = round_up(phys_sz, PAGE_SIZE);
@ -98,28 +299,56 @@ err_flush:
if (err == -ENOMEM)
err = 0;
intel_context_put(ce);
out_file:
mock_file_free(i915, file);
return err;
}
static int igt_copy_blt(void *arg)
static int igt_copy_blt_thread(void *arg)
{
struct drm_i915_private *i915 = arg;
struct intel_context *ce = i915->engine[BCS0]->kernel_context;
struct igt_thread_arg *thread = arg;
struct drm_i915_private *i915 = thread->i915;
struct rnd_state *prng = &thread->prng;
struct drm_i915_gem_object *src, *dst;
struct rnd_state prng;
struct i915_gem_context *ctx;
struct intel_context *ce;
struct drm_file *file;
unsigned int prio;
IGT_TIMEOUT(end);
u32 *vaddr;
int err = 0;
int err;
prandom_seed_state(&prng, i915_selftest.random_seed);
file = mock_file(i915);
if (IS_ERR(file))
return PTR_ERR(file);
ctx = live_context(i915, file);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_file;
}
prio = i915_prandom_u32_max_state(I915_PRIORITY_MAX, prng);
ctx->sched.priority = I915_USER_PRIORITY(prio);
ce = i915_gem_context_get_engine(ctx, BCS0);
GEM_BUG_ON(IS_ERR(ce));
do {
const u32 max_block_size = S16_MAX * PAGE_SIZE;
u32 sz = min_t(u64, ce->vm->total >> 4, prandom_u32_state(&prng));
u32 phys_sz = sz % (max_block_size + 1);
u32 val = prandom_u32_state(&prng);
u32 val = prandom_u32_state(prng);
u64 total = ce->vm->total;
u32 phys_sz;
u32 sz;
u32 *vaddr;
u32 i;
if (i915_is_ggtt(ce->vm))
total = div64_u64(total, thread->n_cpus);
sz = min_t(u64, total >> 4, prandom_u32_state(prng));
phys_sz = sz % (max_block_size + 1);
sz = round_up(sz, PAGE_SIZE);
phys_sz = round_up(phys_sz, PAGE_SIZE);
@ -201,12 +430,85 @@ err_flush:
if (err == -ENOMEM)
err = 0;
intel_context_put(ce);
out_file:
mock_file_free(i915, file);
return err;
}
static int igt_threaded_blt(struct drm_i915_private *i915,
int (*blt_fn)(void *arg))
{
struct igt_thread_arg *thread;
struct task_struct **tsk;
I915_RND_STATE(prng);
unsigned int n_cpus;
unsigned int i;
int err = 0;
n_cpus = num_online_cpus() + 1;
tsk = kcalloc(n_cpus, sizeof(struct task_struct *), GFP_KERNEL);
if (!tsk)
return 0;
thread = kcalloc(n_cpus, sizeof(struct igt_thread_arg), GFP_KERNEL);
if (!thread) {
kfree(tsk);
return 0;
}
for (i = 0; i < n_cpus; ++i) {
thread[i].i915 = i915;
thread[i].n_cpus = n_cpus;
thread[i].prng =
I915_RND_STATE_INITIALIZER(prandom_u32_state(&prng));
tsk[i] = kthread_run(blt_fn, &thread[i], "igt/blt-%d", i);
if (IS_ERR(tsk[i])) {
err = PTR_ERR(tsk[i]);
break;
}
get_task_struct(tsk[i]);
}
yield(); /* start all threads before we kthread_stop() */
for (i = 0; i < n_cpus; ++i) {
int status;
if (IS_ERR_OR_NULL(tsk[i]))
continue;
status = kthread_stop(tsk[i]);
if (status && !err)
err = status;
put_task_struct(tsk[i]);
}
kfree(tsk);
kfree(thread);
return err;
}
static int igt_fill_blt(void *arg)
{
return igt_threaded_blt(arg, igt_fill_blt_thread);
}
static int igt_copy_blt(void *arg)
{
return igt_threaded_blt(arg, igt_copy_blt_thread);
}
int i915_gem_object_blt_live_selftests(struct drm_i915_private *i915)
{
static const struct i915_subtest tests[] = {
SUBTEST(perf_fill_blt),
SUBTEST(perf_copy_blt),
SUBTEST(igt_fill_blt),
SUBTEST(igt_copy_blt),
};

View File

@ -22,6 +22,8 @@ mock_context(struct drm_i915_private *i915,
INIT_LIST_HEAD(&ctx->link);
ctx->i915 = i915;
i915_gem_context_set_persistence(ctx);
mutex_init(&ctx->engines_mutex);
e = default_engines(ctx);
if (IS_ERR(e))

View File

@ -13,6 +13,7 @@
#include "intel_context.h"
#include "intel_engine.h"
#include "intel_engine_pm.h"
#include "intel_ring.h"
static struct i915_global_context {
struct i915_global base;

View File

@ -12,6 +12,7 @@
#include "i915_active.h"
#include "intel_context_types.h"
#include "intel_engine_types.h"
#include "intel_ring_types.h"
#include "intel_timeline_types.h"
void intel_context_init(struct intel_context *ce,

View File

@ -19,6 +19,7 @@
#include "intel_workarounds.h"
struct drm_printer;
struct intel_gt;
/* Early gen2 devices have a cacheline of just 32 bytes, using 64 is overkill,
* but keeps the logic simple. Indeed, the whole purpose of this macro is just
@ -89,38 +90,6 @@ struct drm_printer;
/* seqno size is actually only a uint32, but since we plan to use MI_FLUSH_DW to
* do the writes, and that must have qw aligned offsets, simply pretend it's 8b.
*/
enum intel_engine_hangcheck_action {
ENGINE_IDLE = 0,
ENGINE_WAIT,
ENGINE_ACTIVE_SEQNO,
ENGINE_ACTIVE_HEAD,
ENGINE_ACTIVE_SUBUNITS,
ENGINE_WAIT_KICK,
ENGINE_DEAD,
};
static inline const char *
hangcheck_action_to_str(const enum intel_engine_hangcheck_action a)
{
switch (a) {
case ENGINE_IDLE:
return "idle";
case ENGINE_WAIT:
return "wait";
case ENGINE_ACTIVE_SEQNO:
return "active seqno";
case ENGINE_ACTIVE_HEAD:
return "active head";
case ENGINE_ACTIVE_SUBUNITS:
return "active subunits";
case ENGINE_WAIT_KICK:
return "wait kick";
case ENGINE_DEAD:
return "dead";
}
return "unknown";
}
static inline unsigned int
execlists_num_ports(const struct intel_engine_execlists * const execlists)
@ -206,126 +175,13 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
#define I915_HWS_CSB_WRITE_INDEX 0x1f
#define CNL_HWS_CSB_WRITE_INDEX 0x2f
struct intel_ring *
intel_engine_create_ring(struct intel_engine_cs *engine, int size);
int intel_ring_pin(struct intel_ring *ring);
void intel_ring_reset(struct intel_ring *ring, u32 tail);
unsigned int intel_ring_update_space(struct intel_ring *ring);
void intel_ring_unpin(struct intel_ring *ring);
void intel_ring_free(struct kref *ref);
static inline struct intel_ring *intel_ring_get(struct intel_ring *ring)
{
kref_get(&ring->ref);
return ring;
}
static inline void intel_ring_put(struct intel_ring *ring)
{
kref_put(&ring->ref, intel_ring_free);
}
void intel_engine_stop(struct intel_engine_cs *engine);
void intel_engine_cleanup(struct intel_engine_cs *engine);
int __must_check intel_ring_cacheline_align(struct i915_request *rq);
u32 __must_check *intel_ring_begin(struct i915_request *rq, unsigned int n);
static inline void intel_ring_advance(struct i915_request *rq, u32 *cs)
{
/* Dummy function.
*
* This serves as a placeholder in the code so that the reader
* can compare against the preceding intel_ring_begin() and
* check that the number of dwords emitted matches the space
* reserved for the command packet (i.e. the value passed to
* intel_ring_begin()).
*/
GEM_BUG_ON((rq->ring->vaddr + rq->ring->emit) != cs);
}
static inline u32 intel_ring_wrap(const struct intel_ring *ring, u32 pos)
{
return pos & (ring->size - 1);
}
static inline bool
intel_ring_offset_valid(const struct intel_ring *ring,
unsigned int pos)
{
if (pos & -ring->size) /* must be strictly within the ring */
return false;
if (!IS_ALIGNED(pos, 8)) /* must be qword aligned */
return false;
return true;
}
static inline u32 intel_ring_offset(const struct i915_request *rq, void *addr)
{
/* Don't write ring->size (equivalent to 0) as that hangs some GPUs. */
u32 offset = addr - rq->ring->vaddr;
GEM_BUG_ON(offset > rq->ring->size);
return intel_ring_wrap(rq->ring, offset);
}
static inline void
assert_ring_tail_valid(const struct intel_ring *ring, unsigned int tail)
{
GEM_BUG_ON(!intel_ring_offset_valid(ring, tail));
/*
* "Ring Buffer Use"
* Gen2 BSpec "1. Programming Environment" / 1.4.4.6
* Gen3 BSpec "1c Memory Interface Functions" / 2.3.4.5
* Gen4+ BSpec "1c Memory Interface and Command Stream" / 5.3.4.5
* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
* same cacheline, the Head Pointer must not be greater than the Tail
* Pointer."
*
* We use ring->head as the last known location of the actual RING_HEAD,
* it may have advanced but in the worst case it is equally the same
* as ring->head and so we should never program RING_TAIL to advance
* into the same cacheline as ring->head.
*/
#define cacheline(a) round_down(a, CACHELINE_BYTES)
GEM_BUG_ON(cacheline(tail) == cacheline(ring->head) &&
tail < ring->head);
#undef cacheline
}
static inline unsigned int
intel_ring_set_tail(struct intel_ring *ring, unsigned int tail)
{
/* Whilst writes to the tail are strictly order, there is no
* serialisation between readers and the writers. The tail may be
* read by i915_request_retire() just as it is being updated
* by execlists, as although the breadcrumb is complete, the context
* switch hasn't been seen.
*/
assert_ring_tail_valid(ring, tail);
ring->tail = tail;
return tail;
}
static inline unsigned int
__intel_ring_space(unsigned int head, unsigned int tail, unsigned int size)
{
/*
* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
* same cacheline, the Head Pointer must not be greater than the Tail
* Pointer."
*/
GEM_BUG_ON(!is_power_of_2(size));
return (head - tail - CACHELINE_BYTES) & (size - 1);
}
int intel_engines_init_mmio(struct drm_i915_private *i915);
int intel_engines_setup(struct drm_i915_private *i915);
int intel_engines_init(struct drm_i915_private *i915);
void intel_engines_cleanup(struct drm_i915_private *i915);
int intel_engines_init_mmio(struct intel_gt *gt);
int intel_engines_setup(struct intel_gt *gt);
int intel_engines_init(struct intel_gt *gt);
void intel_engines_cleanup(struct intel_gt *gt);
int intel_engine_init_common(struct intel_engine_cs *engine);
void intel_engine_cleanup_common(struct intel_engine_cs *engine);
@ -434,61 +290,6 @@ void intel_engine_dump(struct intel_engine_cs *engine,
struct drm_printer *m,
const char *header, ...);
static inline void intel_engine_context_in(struct intel_engine_cs *engine)
{
unsigned long flags;
if (READ_ONCE(engine->stats.enabled) == 0)
return;
write_seqlock_irqsave(&engine->stats.lock, flags);
if (engine->stats.enabled > 0) {
if (engine->stats.active++ == 0)
engine->stats.start = ktime_get();
GEM_BUG_ON(engine->stats.active == 0);
}
write_sequnlock_irqrestore(&engine->stats.lock, flags);
}
static inline void intel_engine_context_out(struct intel_engine_cs *engine)
{
unsigned long flags;
if (READ_ONCE(engine->stats.enabled) == 0)
return;
write_seqlock_irqsave(&engine->stats.lock, flags);
if (engine->stats.enabled > 0) {
ktime_t last;
if (engine->stats.active && --engine->stats.active == 0) {
/*
* Decrement the active context count and in case GPU
* is now idle add up to the running total.
*/
last = ktime_sub(ktime_get(), engine->stats.start);
engine->stats.total = ktime_add(engine->stats.total,
last);
} else if (engine->stats.active == 0) {
/*
* After turning on engine stats, context out might be
* the first event in which case we account from the
* time stats gathering was turned on.
*/
last = ktime_sub(ktime_get(), engine->stats.enabled_at);
engine->stats.total = ktime_add(engine->stats.total,
last);
}
}
write_sequnlock_irqrestore(&engine->stats.lock, flags);
}
int intel_enable_engine_stats(struct intel_engine_cs *engine);
void intel_disable_engine_stats(struct intel_engine_cs *engine);
@ -525,4 +326,22 @@ void intel_engine_init_active(struct intel_engine_cs *engine,
#define ENGINE_MOCK 1
#define ENGINE_VIRTUAL 2
static inline bool
intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
{
if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
return false;
return intel_engine_has_preemption(engine);
}
static inline bool
intel_engine_has_timeslices(const struct intel_engine_cs *engine)
{
if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
return false;
return intel_engine_has_semaphores(engine);
}
#endif /* _INTEL_RINGBUFFER_H_ */

View File

@ -37,6 +37,7 @@
#include "intel_context.h"
#include "intel_lrc.h"
#include "intel_reset.h"
#include "intel_ring.h"
/* Haswell does have the CXT_SIZE register however it does not appear to be
* valid. Now, docs explain in dwords what is in the context object. The full
@ -308,6 +309,15 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
engine->instance = info->instance;
__sprint_engine_name(engine);
engine->props.heartbeat_interval_ms =
CONFIG_DRM_I915_HEARTBEAT_INTERVAL;
engine->props.preempt_timeout_ms =
CONFIG_DRM_I915_PREEMPT_TIMEOUT;
engine->props.stop_timeout_ms =
CONFIG_DRM_I915_STOP_TIMEOUT;
engine->props.timeslice_duration_ms =
CONFIG_DRM_I915_TIMESLICE_DURATION;
/*
* To be overridden by the backend on setup. However to facilitate
* cleanup on error during setup, we always provide the destroy vfunc.
@ -370,38 +380,40 @@ static void __setup_engine_capabilities(struct intel_engine_cs *engine)
}
}
static void intel_setup_engine_capabilities(struct drm_i915_private *i915)
static void intel_setup_engine_capabilities(struct intel_gt *gt)
{
struct intel_engine_cs *engine;
enum intel_engine_id id;
for_each_engine(engine, i915, id)
for_each_engine(engine, gt, id)
__setup_engine_capabilities(engine);
}
/**
* intel_engines_cleanup() - free the resources allocated for Command Streamers
* @i915: the i915 devic
* @gt: pointer to struct intel_gt
*/
void intel_engines_cleanup(struct drm_i915_private *i915)
void intel_engines_cleanup(struct intel_gt *gt)
{
struct intel_engine_cs *engine;
enum intel_engine_id id;
for_each_engine(engine, i915, id) {
for_each_engine(engine, gt, id) {
engine->destroy(engine);
i915->engine[id] = NULL;
gt->engine[id] = NULL;
gt->i915->engine[id] = NULL;
}
}
/**
* intel_engines_init_mmio() - allocate and prepare the Engine Command Streamers
* @i915: the i915 device
* @gt: pointer to struct intel_gt
*
* Return: non-zero if the initialization failed.
*/
int intel_engines_init_mmio(struct drm_i915_private *i915)
int intel_engines_init_mmio(struct intel_gt *gt)
{
struct drm_i915_private *i915 = gt->i915;
struct intel_device_info *device_info = mkwrite_device_info(i915);
const unsigned int engine_mask = INTEL_INFO(i915)->engine_mask;
unsigned int mask = 0;
@ -419,7 +431,7 @@ int intel_engines_init_mmio(struct drm_i915_private *i915)
if (!HAS_ENGINE(i915, i))
continue;
err = intel_engine_setup(&i915->gt, i);
err = intel_engine_setup(gt, i);
if (err)
goto cleanup;
@ -436,36 +448,36 @@ int intel_engines_init_mmio(struct drm_i915_private *i915)
RUNTIME_INFO(i915)->num_engines = hweight32(mask);
intel_gt_check_and_clear_faults(&i915->gt);
intel_gt_check_and_clear_faults(gt);
intel_setup_engine_capabilities(i915);
intel_setup_engine_capabilities(gt);
return 0;
cleanup:
intel_engines_cleanup(i915);
intel_engines_cleanup(gt);
return err;
}
/**
* intel_engines_init() - init the Engine Command Streamers
* @i915: i915 device private
* @gt: pointer to struct intel_gt
*
* Return: non-zero if the initialization failed.
*/
int intel_engines_init(struct drm_i915_private *i915)
int intel_engines_init(struct intel_gt *gt)
{
int (*init)(struct intel_engine_cs *engine);
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err;
if (HAS_EXECLISTS(i915))
if (HAS_EXECLISTS(gt->i915))
init = intel_execlists_submission_init;
else
init = intel_ring_submission_init;
for_each_engine(engine, i915, id) {
for_each_engine(engine, gt, id) {
err = init(engine);
if (err)
goto cleanup;
@ -474,7 +486,7 @@ int intel_engines_init(struct drm_i915_private *i915)
return 0;
cleanup:
intel_engines_cleanup(i915);
intel_engines_cleanup(gt);
return err;
}
@ -518,7 +530,7 @@ static int pin_ggtt_status_page(struct intel_engine_cs *engine,
unsigned int flags;
flags = PIN_GLOBAL;
if (!HAS_LLC(engine->i915))
if (!HAS_LLC(engine->i915) && i915_ggtt_has_aperture(engine->gt->ggtt))
/*
* On g33, we cannot place HWS above 256MiB, so
* restrict its pinning to the low mappable arena.
@ -602,7 +614,6 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
intel_engine_init_active(engine, ENGINE_PHYSICAL);
intel_engine_init_breadcrumbs(engine);
intel_engine_init_execlists(engine);
intel_engine_init_hangcheck(engine);
intel_engine_init_cmd_parser(engine);
intel_engine_init__pm(engine);
@ -621,26 +632,26 @@ static int intel_engine_setup_common(struct intel_engine_cs *engine)
/**
* intel_engines_setup- setup engine state not requiring hw access
* @i915: Device to setup.
* @gt: pointer to struct intel_gt
*
* Initializes engine structure members shared between legacy and execlists
* submission modes which do not require hardware access.
*
* Typically done early in the submission mode specific engine setup stage.
*/
int intel_engines_setup(struct drm_i915_private *i915)
int intel_engines_setup(struct intel_gt *gt)
{
int (*setup)(struct intel_engine_cs *engine);
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err;
if (HAS_EXECLISTS(i915))
if (HAS_EXECLISTS(gt->i915))
setup = intel_execlists_submission_setup;
else
setup = intel_ring_submission_setup;
for_each_engine(engine, i915, id) {
for_each_engine(engine, gt, id) {
err = intel_engine_setup_common(engine);
if (err)
goto cleanup;
@ -658,7 +669,7 @@ int intel_engines_setup(struct drm_i915_private *i915)
return 0;
cleanup:
intel_engines_cleanup(i915);
intel_engines_cleanup(gt);
return err;
}
@ -873,6 +884,21 @@ u64 intel_engine_get_last_batch_head(const struct intel_engine_cs *engine)
return bbaddr;
}
static unsigned long stop_timeout(const struct intel_engine_cs *engine)
{
if (in_atomic() || irqs_disabled()) /* inside atomic preempt-reset? */
return 0;
/*
* If we are doing a normal GPU reset, we can take our time and allow
* the engine to quiesce. We've stopped submission to the engine, and
* if we wait long enough an innocent context should complete and
* leave the engine idle. So they should not be caught unaware by
* the forthcoming GPU reset (which usually follows the stop_cs)!
*/
return READ_ONCE(engine->props.stop_timeout_ms);
}
int intel_engine_stop_cs(struct intel_engine_cs *engine)
{
struct intel_uncore *uncore = engine->uncore;
@ -890,7 +916,7 @@ int intel_engine_stop_cs(struct intel_engine_cs *engine)
err = 0;
if (__intel_wait_for_register_fw(uncore,
mode, MODE_IDLE, MODE_IDLE,
1000, 0,
1000, stop_timeout(engine),
NULL)) {
GEM_TRACE("%s: timed out on STOP_RING -> IDLE\n", engine->name);
err = -ETIMEDOUT;
@ -1318,10 +1344,11 @@ static void intel_engine_print_registers(struct intel_engine_cs *engine,
unsigned int idx;
u8 read, write;
drm_printf(m, "\tExeclist tasklet queued? %s (%s), timeslice? %s\n",
drm_printf(m, "\tExeclist tasklet queued? %s (%s), preempt? %s, timeslice? %s\n",
yesno(test_bit(TASKLET_STATE_SCHED,
&engine->execlists.tasklet.state)),
enableddisabled(!atomic_read(&engine->execlists.tasklet.count)),
repr_timer(&engine->execlists.preempt),
repr_timer(&engine->execlists.timer));
read = execlists->csb_head;
@ -1447,8 +1474,13 @@ void intel_engine_dump(struct intel_engine_cs *engine,
drm_printf(m, "*** WEDGED ***\n");
drm_printf(m, "\tAwake? %d\n", atomic_read(&engine->wakeref.count));
drm_printf(m, "\tHangcheck: %d ms ago\n",
jiffies_to_msecs(jiffies - engine->hangcheck.action_timestamp));
rcu_read_lock();
rq = READ_ONCE(engine->heartbeat.systole);
if (rq)
drm_printf(m, "\tHeartbeat: %d ms ago\n",
jiffies_to_msecs(jiffies - rq->emitted_jiffies));
rcu_read_unlock();
drm_printf(m, "\tReset count: %d (global %d)\n",
i915_reset_engine_count(error, engine),
i915_reset_count(error));

View File

@ -0,0 +1,234 @@
/*
* SPDX-License-Identifier: MIT
*
* Copyright © 2019 Intel Corporation
*/
#include "i915_request.h"
#include "intel_context.h"
#include "intel_engine_heartbeat.h"
#include "intel_engine_pm.h"
#include "intel_engine.h"
#include "intel_gt.h"
#include "intel_reset.h"
/*
* While the engine is active, we send a periodic pulse along the engine
* to check on its health and to flush any idle-barriers. If that request
* is stuck, and we fail to preempt it, we declare the engine hung and
* issue a reset -- in the hope that restores progress.
*/
static bool next_heartbeat(struct intel_engine_cs *engine)
{
long delay;
delay = READ_ONCE(engine->props.heartbeat_interval_ms);
if (!delay)
return false;
delay = msecs_to_jiffies_timeout(delay);
if (delay >= HZ)
delay = round_jiffies_up_relative(delay);
schedule_delayed_work(&engine->heartbeat.work, delay);
return true;
}
static void idle_pulse(struct intel_engine_cs *engine, struct i915_request *rq)
{
engine->wakeref_serial = READ_ONCE(engine->serial) + 1;
i915_request_add_active_barriers(rq);
}
static void show_heartbeat(const struct i915_request *rq,
struct intel_engine_cs *engine)
{
struct drm_printer p = drm_debug_printer("heartbeat");
intel_engine_dump(engine, &p,
"%s heartbeat {prio:%d} not ticking\n",
engine->name,
rq->sched.attr.priority);
}
static void heartbeat(struct work_struct *wrk)
{
struct i915_sched_attr attr = {
.priority = I915_USER_PRIORITY(I915_PRIORITY_MIN),
};
struct intel_engine_cs *engine =
container_of(wrk, typeof(*engine), heartbeat.work.work);
struct intel_context *ce = engine->kernel_context;
struct i915_request *rq;
if (!intel_engine_pm_get_if_awake(engine))
return;
rq = engine->heartbeat.systole;
if (rq && i915_request_completed(rq)) {
i915_request_put(rq);
engine->heartbeat.systole = NULL;
}
if (intel_gt_is_wedged(engine->gt))
goto out;
if (engine->heartbeat.systole) {
if (engine->schedule &&
rq->sched.attr.priority < I915_PRIORITY_BARRIER) {
/*
* Gradually raise the priority of the heartbeat to
* give high priority work [which presumably desires
* low latency and no jitter] the chance to naturally
* complete before being preempted.
*/
attr.priority = I915_PRIORITY_MASK;
if (rq->sched.attr.priority >= attr.priority)
attr.priority |= I915_USER_PRIORITY(I915_PRIORITY_HEARTBEAT);
if (rq->sched.attr.priority >= attr.priority)
attr.priority = I915_PRIORITY_BARRIER;
local_bh_disable();
engine->schedule(rq, &attr);
local_bh_enable();
} else {
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
show_heartbeat(rq, engine);
intel_gt_handle_error(engine->gt, engine->mask,
I915_ERROR_CAPTURE,
"stopped heartbeat on %s",
engine->name);
}
goto out;
}
if (engine->wakeref_serial == engine->serial)
goto out;
mutex_lock(&ce->timeline->mutex);
intel_context_enter(ce);
rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN);
intel_context_exit(ce);
if (IS_ERR(rq))
goto unlock;
idle_pulse(engine, rq);
if (i915_modparams.enable_hangcheck)
engine->heartbeat.systole = i915_request_get(rq);
__i915_request_commit(rq);
__i915_request_queue(rq, &attr);
unlock:
mutex_unlock(&ce->timeline->mutex);
out:
if (!next_heartbeat(engine))
i915_request_put(fetch_and_zero(&engine->heartbeat.systole));
intel_engine_pm_put(engine);
}
void intel_engine_unpark_heartbeat(struct intel_engine_cs *engine)
{
if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
return;
next_heartbeat(engine);
}
void intel_engine_park_heartbeat(struct intel_engine_cs *engine)
{
cancel_delayed_work(&engine->heartbeat.work);
i915_request_put(fetch_and_zero(&engine->heartbeat.systole));
}
void intel_engine_init_heartbeat(struct intel_engine_cs *engine)
{
INIT_DELAYED_WORK(&engine->heartbeat.work, heartbeat);
}
int intel_engine_set_heartbeat(struct intel_engine_cs *engine,
unsigned long delay)
{
int err;
/* Send one last pulse before to cleanup persistent hogs */
if (!delay && IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT)) {
err = intel_engine_pulse(engine);
if (err)
return err;
}
WRITE_ONCE(engine->props.heartbeat_interval_ms, delay);
if (intel_engine_pm_get_if_awake(engine)) {
if (delay)
intel_engine_unpark_heartbeat(engine);
else
intel_engine_park_heartbeat(engine);
intel_engine_pm_put(engine);
}
return 0;
}
int intel_engine_pulse(struct intel_engine_cs *engine)
{
struct i915_sched_attr attr = { .priority = I915_PRIORITY_BARRIER };
struct intel_context *ce = engine->kernel_context;
struct i915_request *rq;
int err = 0;
if (!intel_engine_has_preemption(engine))
return -ENODEV;
if (!intel_engine_pm_get_if_awake(engine))
return 0;
if (mutex_lock_interruptible(&ce->timeline->mutex))
goto out_rpm;
intel_context_enter(ce);
rq = __i915_request_create(ce, GFP_NOWAIT | __GFP_NOWARN);
intel_context_exit(ce);
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto out_unlock;
}
rq->flags |= I915_REQUEST_SENTINEL;
idle_pulse(engine, rq);
__i915_request_commit(rq);
__i915_request_queue(rq, &attr);
out_unlock:
mutex_unlock(&ce->timeline->mutex);
out_rpm:
intel_engine_pm_put(engine);
return err;
}
int intel_engine_flush_barriers(struct intel_engine_cs *engine)
{
struct i915_request *rq;
if (llist_empty(&engine->barrier_tasks))
return 0;
rq = i915_request_create(engine->kernel_context);
if (IS_ERR(rq))
return PTR_ERR(rq);
idle_pulse(engine, rq);
i915_request_add(rq);
return 0;
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftest_engine_heartbeat.c"
#endif

View File

@ -0,0 +1,23 @@
/*
* SPDX-License-Identifier: MIT
*
* Copyright © 2019 Intel Corporation
*/
#ifndef INTEL_ENGINE_HEARTBEAT_H
#define INTEL_ENGINE_HEARTBEAT_H
struct intel_engine_cs;
void intel_engine_init_heartbeat(struct intel_engine_cs *engine);
int intel_engine_set_heartbeat(struct intel_engine_cs *engine,
unsigned long delay);
void intel_engine_park_heartbeat(struct intel_engine_cs *engine);
void intel_engine_unpark_heartbeat(struct intel_engine_cs *engine);
int intel_engine_pulse(struct intel_engine_cs *engine);
int intel_engine_flush_barriers(struct intel_engine_cs *engine);
#endif /* INTEL_ENGINE_HEARTBEAT_H */

View File

@ -7,11 +7,13 @@
#include "i915_drv.h"
#include "intel_engine.h"
#include "intel_engine_heartbeat.h"
#include "intel_engine_pm.h"
#include "intel_engine_pool.h"
#include "intel_gt.h"
#include "intel_gt_pm.h"
#include "intel_rc6.h"
#include "intel_ring.h"
static int __engine_unpark(struct intel_wakeref *wf)
{
@ -34,7 +36,7 @@ static int __engine_unpark(struct intel_wakeref *wf)
if (engine->unpark)
engine->unpark(engine);
intel_engine_init_hangcheck(engine);
intel_engine_unpark_heartbeat(engine);
return 0;
}
@ -111,7 +113,7 @@ static bool switch_to_kernel_context(struct intel_engine_cs *engine)
i915_request_add_active_barriers(rq);
/* Install ourselves as a preemption barrier */
rq->sched.attr.priority = I915_PRIORITY_UNPREEMPTABLE;
rq->sched.attr.priority = I915_PRIORITY_BARRIER;
__i915_request_commit(rq);
/* Release our exclusive hold on the engine */
@ -158,6 +160,7 @@ static int __engine_park(struct intel_wakeref *wf)
call_idle_barriers(engine); /* cleanup after wedging */
intel_engine_park_heartbeat(engine);
intel_engine_disarm_breadcrumbs(engine);
intel_engine_pool_park(&engine->pool);
@ -188,6 +191,7 @@ void intel_engine_init__pm(struct intel_engine_cs *engine)
struct intel_runtime_pm *rpm = engine->uncore->rpm;
intel_wakeref_init(&engine->wakeref, rpm, &wf_ops);
intel_engine_init_heartbeat(engine);
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)

View File

@ -15,6 +15,7 @@
#include <linux/rbtree.h>
#include <linux/timer.h>
#include <linux/types.h>
#include <linux/workqueue.h>
#include "i915_gem.h"
#include "i915_pmu.h"
@ -58,6 +59,7 @@ struct i915_gem_context;
struct i915_request;
struct i915_sched_attr;
struct intel_gt;
struct intel_ring;
struct intel_uncore;
typedef u8 intel_engine_mask_t;
@ -76,40 +78,6 @@ struct intel_instdone {
u32 row[I915_MAX_SLICES][I915_MAX_SUBSLICES];
};
struct intel_engine_hangcheck {
u64 acthd;
u32 last_ring;
u32 last_head;
unsigned long action_timestamp;
struct intel_instdone instdone;
};
struct intel_ring {
struct kref ref;
struct i915_vma *vma;
void *vaddr;
/*
* As we have two types of rings, one global to the engine used
* by ringbuffer submission and those that are exclusive to a
* context used by execlists, we have to play safe and allow
* atomic updates to the pin_count. However, the actual pinning
* of the context is either done during initialisation for
* ringbuffer submission or serialised as part of the context
* pinning for execlists, and so we do not need a mutex ourselves
* to serialise intel_ring_pin/intel_ring_unpin.
*/
atomic_t pin_count;
u32 head;
u32 tail;
u32 emit;
u32 space;
u32 size;
u32 effective_size;
};
/*
* we use a single page to load ctx workarounds so all of these
* values are referred in terms of dwords
@ -174,6 +142,11 @@ struct intel_engine_execlists {
*/
struct timer_list timer;
/**
* @preempt: reset the current context if it fails to give way
*/
struct timer_list preempt;
/**
* @default_priolist: priority list for I915_PRIORITY_NORMAL
*/
@ -326,6 +299,11 @@ struct intel_engine_cs {
intel_engine_mask_t saturated; /* submitting semaphores too late? */
struct {
struct delayed_work work;
struct i915_request *systole;
} heartbeat;
unsigned long serial;
unsigned long wakeref_serial;
@ -476,8 +454,6 @@ struct intel_engine_cs {
/* status_notifier: list of callbacks for context-switch changes */
struct atomic_notifier_head context_status_notifier;
struct intel_engine_hangcheck hangcheck;
#define I915_ENGINE_NEEDS_CMD_PARSER BIT(0)
#define I915_ENGINE_SUPPORTS_STATS BIT(1)
#define I915_ENGINE_HAS_PREEMPTION BIT(2)
@ -542,6 +518,13 @@ struct intel_engine_cs {
*/
ktime_t total;
} stats;
struct {
unsigned long heartbeat_interval_ms;
unsigned long preempt_timeout_ms;
unsigned long stop_timeout_ms;
unsigned long timeslice_duration_ms;
} props;
};
static inline bool

View File

@ -9,6 +9,7 @@
#include "intel_gt_requests.h"
#include "intel_mocs.h"
#include "intel_rc6.h"
#include "intel_rps.h"
#include "intel_uncore.h"
#include "intel_pm.h"
@ -22,19 +23,17 @@ void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
INIT_LIST_HEAD(&gt->closed_vma);
spin_lock_init(&gt->closed_lock);
intel_gt_init_hangcheck(gt);
intel_gt_init_reset(gt);
intel_gt_init_requests(gt);
intel_gt_pm_init_early(gt);
intel_rps_init_early(&gt->rps);
intel_uc_init_early(&gt->uc);
}
void intel_gt_init_hw_early(struct drm_i915_private *i915)
{
i915->gt.ggtt = &i915->ggtt;
/* BIOS often leaves RC6 enabled, but disable it for hw init */
intel_gt_pm_disable(&i915->gt);
}
static void init_unused_ring(struct intel_gt *gt, u32 base)
@ -321,8 +320,7 @@ void intel_gt_chipset_flush(struct intel_gt *gt)
void intel_gt_driver_register(struct intel_gt *gt)
{
if (IS_GEN(gt->i915, 5))
intel_gpu_ips_init(gt->i915);
intel_rps_driver_register(&gt->rps);
}
static int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size)
@ -380,20 +378,16 @@ int intel_gt_init(struct intel_gt *gt)
void intel_gt_driver_remove(struct intel_gt *gt)
{
GEM_BUG_ON(gt->awake);
intel_gt_pm_disable(gt);
}
void intel_gt_driver_unregister(struct intel_gt *gt)
{
intel_gpu_ips_teardown();
intel_rps_driver_unregister(&gt->rps);
}
void intel_gt_driver_release(struct intel_gt *gt)
{
/* Paranoia: make sure we have disabled everything before we exit. */
intel_gt_pm_disable(gt);
intel_gt_pm_fini(gt);
intel_gt_fini_scratch(gt);
}

View File

@ -46,8 +46,6 @@ void intel_gt_clear_error_registers(struct intel_gt *gt,
void intel_gt_flush_ggtt_writes(struct intel_gt *gt);
void intel_gt_chipset_flush(struct intel_gt *gt);
void intel_gt_init_hangcheck(struct intel_gt *gt);
static inline u32 intel_gt_scratch_offset(const struct intel_gt *gt,
enum intel_gt_scratch_field field)
{
@ -59,6 +57,4 @@ static inline bool intel_gt_is_wedged(struct intel_gt *gt)
return __intel_reset_failed(&gt->reset);
}
void intel_gt_queue_hangcheck(struct intel_gt *gt);
#endif /* __INTEL_GT_H__ */

View File

@ -11,6 +11,7 @@
#include "intel_gt.h"
#include "intel_gt_irq.h"
#include "intel_uncore.h"
#include "intel_rps.h"
static void guc_irq_handler(struct intel_guc *guc, u16 iir)
{
@ -77,7 +78,7 @@ gen11_other_irq_handler(struct intel_gt *gt, const u8 instance,
return guc_irq_handler(&gt->uc.guc, iir);
if (instance == OTHER_GTPM_INSTANCE)
return gen11_rps_irq_handler(gt, iir);
return gen11_rps_irq_handler(&gt->rps, iir);
WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n",
instance, iir);
@ -336,7 +337,7 @@ void gen8_gt_irq_handler(struct intel_gt *gt, u32 master_ctl, u32 gt_iir[4])
}
if (master_ctl & (GEN8_GT_PM_IRQ | GEN8_GT_GUC_IRQ)) {
gen6_rps_irq_handler(gt->i915, gt_iir[2]);
gen6_rps_irq_handler(&gt->rps, gt_iir[2]);
guc_irq_handler(&gt->uc.guc, gt_iir[2] >> 16);
}
}

View File

@ -12,15 +12,12 @@
#include "intel_gt.h"
#include "intel_gt_pm.h"
#include "intel_gt_requests.h"
#include "intel_llc.h"
#include "intel_pm.h"
#include "intel_rc6.h"
#include "intel_rps.h"
#include "intel_wakeref.h"
static void pm_notify(struct intel_gt *gt, int state)
{
blocking_notifier_call_chain(&gt->pm_notifications, state, gt->i915);
}
static int __gt_unpark(struct intel_wakeref *wf)
{
struct intel_gt *gt = container_of(wf, typeof(*gt), wakeref);
@ -44,19 +41,11 @@ static int __gt_unpark(struct intel_wakeref *wf)
gt->awake = intel_display_power_get(i915, POWER_DOMAIN_GT_IRQ);
GEM_BUG_ON(!gt->awake);
intel_enable_gt_powersave(i915);
i915_update_gfx_val(i915);
if (INTEL_GEN(i915) >= 6)
gen6_rps_busy(i915);
intel_rps_unpark(&gt->rps);
i915_pmu_gt_unparked(i915);
intel_gt_queue_hangcheck(gt);
intel_gt_unpark_requests(gt);
pm_notify(gt, INTEL_GT_UNPARK);
return 0;
}
@ -68,12 +57,11 @@ static int __gt_park(struct intel_wakeref *wf)
GEM_TRACE("\n");
pm_notify(gt, INTEL_GT_PARK);
intel_gt_park_requests(gt);
i915_vma_parked(gt);
i915_pmu_gt_parked(i915);
if (INTEL_GEN(i915) >= 6)
gen6_rps_idle(i915);
intel_rps_park(&gt->rps);
/* Everything switched off, flush any residual interrupt just in case */
intel_synchronize_irq(i915);
@ -95,8 +83,6 @@ static const struct intel_wakeref_ops wf_ops = {
void intel_gt_pm_init_early(struct intel_gt *gt)
{
intel_wakeref_init(&gt->wakeref, gt->uncore->rpm, &wf_ops);
BLOCKING_INIT_NOTIFIER_HEAD(&gt->pm_notifications);
}
void intel_gt_pm_init(struct intel_gt *gt)
@ -107,6 +93,7 @@ void intel_gt_pm_init(struct intel_gt *gt)
* user.
*/
intel_rc6_init(&gt->rc6);
intel_rps_init(&gt->rps);
}
static bool reset_engines(struct intel_gt *gt)
@ -150,12 +137,6 @@ void intel_gt_sanitize(struct intel_gt *gt, bool force)
engine->reset.finish(engine);
}
void intel_gt_pm_disable(struct intel_gt *gt)
{
if (!is_mock_gt(gt))
intel_sanitize_gt_powersave(gt->i915);
}
void intel_gt_pm_fini(struct intel_gt *gt)
{
intel_rc6_fini(&gt->rc6);
@ -174,9 +155,13 @@ int intel_gt_resume(struct intel_gt *gt)
* allowing us to fixup the user contexts on their first pin.
*/
intel_gt_pm_get(gt);
intel_uncore_forcewake_get(gt->uncore, FORCEWAKE_ALL);
intel_rc6_sanitize(&gt->rc6);
intel_rps_enable(&gt->rps);
intel_llc_enable(&gt->llc);
for_each_engine(engine, gt, id) {
struct intel_context *ce;
@ -185,9 +170,7 @@ int intel_gt_resume(struct intel_gt *gt)
ce = engine->kernel_context;
if (ce) {
GEM_BUG_ON(!intel_context_is_pinned(ce));
mutex_acquire(&ce->pin_mutex.dep_map, 0, 0, _THIS_IP_);
ce->ops->reset(ce);
mutex_release(&ce->pin_mutex.dep_map, 0, _THIS_IP_);
}
engine->serial++; /* kernel context lost */
@ -229,8 +212,11 @@ void intel_gt_suspend(struct intel_gt *gt)
/* We expect to be idle already; but also want to be independent */
wait_for_idle(gt);
with_intel_runtime_pm(gt->uncore->rpm, wakeref)
with_intel_runtime_pm(gt->uncore->rpm, wakeref) {
intel_rps_disable(&gt->rps);
intel_rc6_disable(&gt->rc6);
intel_llc_disable(&gt->llc);
}
}
void intel_gt_runtime_suspend(struct intel_gt *gt)

View File

@ -12,11 +12,6 @@
#include "intel_gt_types.h"
#include "intel_wakeref.h"
enum {
INTEL_GT_UNPARK,
INTEL_GT_PARK,
};
static inline bool intel_gt_pm_is_awake(const struct intel_gt *gt)
{
return intel_wakeref_is_active(&gt->wakeref);
@ -44,7 +39,6 @@ static inline int intel_gt_pm_wait_for_idle(struct intel_gt *gt)
void intel_gt_pm_init_early(struct intel_gt *gt);
void intel_gt_pm_init(struct intel_gt *gt);
void intel_gt_pm_disable(struct intel_gt *gt);
void intel_gt_pm_fini(struct intel_gt *gt);
void intel_gt_sanitize(struct intel_gt *gt, bool force);

View File

@ -20,6 +20,7 @@
#include "intel_llc_types.h"
#include "intel_reset_types.h"
#include "intel_rc6_types.h"
#include "intel_rps_types.h"
#include "intel_wakeref.h"
struct drm_i915_private;
@ -27,14 +28,6 @@ struct i915_ggtt;
struct intel_engine_cs;
struct intel_uncore;
struct intel_hangcheck {
/* For hangcheck timer */
#define DRM_I915_HANGCHECK_PERIOD 1500 /* in ms */
#define DRM_I915_HANGCHECK_JIFFIES msecs_to_jiffies(DRM_I915_HANGCHECK_PERIOD)
struct delayed_work work;
};
struct intel_gt {
struct drm_i915_private *i915;
struct intel_uncore *uncore;
@ -68,7 +61,6 @@ struct intel_gt {
struct list_head closed_vma;
spinlock_t closed_lock; /* guards the list of closed_vma */
struct intel_hangcheck hangcheck;
struct intel_reset reset;
/**
@ -82,8 +74,7 @@ struct intel_gt {
struct intel_llc llc;
struct intel_rc6 rc6;
struct blocking_notifier_head pm_notifications;
struct intel_rps rps;
ktime_t last_init_time;

View File

@ -1,361 +0,0 @@
/*
* Copyright © 2016 Intel Corporation
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the "Software"),
* to deal in the Software without restriction, including without limitation
* the rights to use, copy, modify, merge, publish, distribute, sublicense,
* and/or sell copies of the Software, and to permit persons to whom the
* Software is furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice (including the next
* paragraph) shall be included in all copies or substantial portions of the
* Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
* FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
* IN THE SOFTWARE.
*
*/
#include "i915_drv.h"
#include "intel_engine.h"
#include "intel_gt.h"
#include "intel_reset.h"
struct hangcheck {
u64 acthd;
u32 ring;
u32 head;
enum intel_engine_hangcheck_action action;
unsigned long action_timestamp;
int deadlock;
struct intel_instdone instdone;
bool wedged:1;
bool stalled:1;
};
static bool instdone_unchanged(u32 current_instdone, u32 *old_instdone)
{
u32 tmp = current_instdone | *old_instdone;
bool unchanged;
unchanged = tmp == *old_instdone;
*old_instdone |= tmp;
return unchanged;
}
static bool subunits_stuck(struct intel_engine_cs *engine)
{
struct drm_i915_private *dev_priv = engine->i915;
const struct sseu_dev_info *sseu = &RUNTIME_INFO(dev_priv)->sseu;
struct intel_instdone instdone;
struct intel_instdone *accu_instdone = &engine->hangcheck.instdone;
bool stuck;
int slice;
int subslice;
intel_engine_get_instdone(engine, &instdone);
/* There might be unstable subunit states even when
* actual head is not moving. Filter out the unstable ones by
* accumulating the undone -> done transitions and only
* consider those as progress.
*/
stuck = instdone_unchanged(instdone.instdone,
&accu_instdone->instdone);
stuck &= instdone_unchanged(instdone.slice_common,
&accu_instdone->slice_common);
for_each_instdone_slice_subslice(dev_priv, sseu, slice, subslice) {
stuck &= instdone_unchanged(instdone.sampler[slice][subslice],
&accu_instdone->sampler[slice][subslice]);
stuck &= instdone_unchanged(instdone.row[slice][subslice],
&accu_instdone->row[slice][subslice]);
}
return stuck;
}
static enum intel_engine_hangcheck_action
head_stuck(struct intel_engine_cs *engine, u64 acthd)
{
if (acthd != engine->hangcheck.acthd) {
/* Clear subunit states on head movement */
memset(&engine->hangcheck.instdone, 0,
sizeof(engine->hangcheck.instdone));
return ENGINE_ACTIVE_HEAD;
}
if (!subunits_stuck(engine))
return ENGINE_ACTIVE_SUBUNITS;
return ENGINE_DEAD;
}
static enum intel_engine_hangcheck_action
engine_stuck(struct intel_engine_cs *engine, u64 acthd)
{
enum intel_engine_hangcheck_action ha;
u32 tmp;
ha = head_stuck(engine, acthd);
if (ha != ENGINE_DEAD)
return ha;
if (IS_GEN(engine->i915, 2))
return ENGINE_DEAD;
/* Is the chip hanging on a WAIT_FOR_EVENT?
* If so we can simply poke the RB_WAIT bit
* and break the hang. This should work on
* all but the second generation chipsets.
*/
tmp = ENGINE_READ(engine, RING_CTL);
if (tmp & RING_WAIT) {
intel_gt_handle_error(engine->gt, engine->mask, 0,
"stuck wait on %s", engine->name);
ENGINE_WRITE(engine, RING_CTL, tmp);
return ENGINE_WAIT_KICK;
}
return ENGINE_DEAD;
}
static void hangcheck_load_sample(struct intel_engine_cs *engine,
struct hangcheck *hc)
{
hc->acthd = intel_engine_get_active_head(engine);
hc->ring = ENGINE_READ(engine, RING_START);
hc->head = ENGINE_READ(engine, RING_HEAD);
}
static void hangcheck_store_sample(struct intel_engine_cs *engine,
const struct hangcheck *hc)
{
engine->hangcheck.acthd = hc->acthd;
engine->hangcheck.last_ring = hc->ring;
engine->hangcheck.last_head = hc->head;
}
static enum intel_engine_hangcheck_action
hangcheck_get_action(struct intel_engine_cs *engine,
const struct hangcheck *hc)
{
if (intel_engine_is_idle(engine))
return ENGINE_IDLE;
if (engine->hangcheck.last_ring != hc->ring)
return ENGINE_ACTIVE_SEQNO;
if (engine->hangcheck.last_head != hc->head)
return ENGINE_ACTIVE_SEQNO;
return engine_stuck(engine, hc->acthd);
}
static void hangcheck_accumulate_sample(struct intel_engine_cs *engine,
struct hangcheck *hc)
{
unsigned long timeout = I915_ENGINE_DEAD_TIMEOUT;
hc->action = hangcheck_get_action(engine, hc);
/* We always increment the progress
* if the engine is busy and still processing
* the same request, so that no single request
* can run indefinitely (such as a chain of
* batches). The only time we do not increment
* the hangcheck score on this ring, if this
* engine is in a legitimate wait for another
* engine. In that case the waiting engine is a
* victim and we want to be sure we catch the
* right culprit. Then every time we do kick
* the ring, make it as a progress as the seqno
* advancement might ensure and if not, it
* will catch the hanging engine.
*/
switch (hc->action) {
case ENGINE_IDLE:
case ENGINE_ACTIVE_SEQNO:
/* Clear head and subunit states on seqno movement */
hc->acthd = 0;
memset(&engine->hangcheck.instdone, 0,
sizeof(engine->hangcheck.instdone));
/* Intentional fall through */
case ENGINE_WAIT_KICK:
case ENGINE_WAIT:
engine->hangcheck.action_timestamp = jiffies;
break;
case ENGINE_ACTIVE_HEAD:
case ENGINE_ACTIVE_SUBUNITS:
/*
* Seqno stuck with still active engine gets leeway,
* in hopes that it is just a long shader.
*/
timeout = I915_SEQNO_DEAD_TIMEOUT;
break;
case ENGINE_DEAD:
break;
default:
MISSING_CASE(hc->action);
}
hc->stalled = time_after(jiffies,
engine->hangcheck.action_timestamp + timeout);
hc->wedged = time_after(jiffies,
engine->hangcheck.action_timestamp +
I915_ENGINE_WEDGED_TIMEOUT);
}
static void hangcheck_declare_hang(struct intel_gt *gt,
intel_engine_mask_t hung,
intel_engine_mask_t stuck)
{
struct intel_engine_cs *engine;
intel_engine_mask_t tmp;
char msg[80];
int len;
/* If some rings hung but others were still busy, only
* blame the hanging rings in the synopsis.
*/
if (stuck != hung)
hung &= ~stuck;
len = scnprintf(msg, sizeof(msg),
"%s on ", stuck == hung ? "no progress" : "hang");
for_each_engine_masked(engine, gt, hung, tmp)
len += scnprintf(msg + len, sizeof(msg) - len,
"%s, ", engine->name);
msg[len-2] = '\0';
return intel_gt_handle_error(gt, hung, I915_ERROR_CAPTURE, "%s", msg);
}
/*
* This is called when the chip hasn't reported back with completed
* batchbuffers in a long time. We keep track per ring seqno progress and
* if there are no progress, hangcheck score for that ring is increased.
* Further, acthd is inspected to see if the ring is stuck. On stuck case
* we kick the ring. If we see no progress on three subsequent calls
* we assume chip is wedged and try to fix it by resetting the chip.
*/
static void hangcheck_elapsed(struct work_struct *work)
{
struct intel_gt *gt =
container_of(work, typeof(*gt), hangcheck.work.work);
intel_engine_mask_t hung = 0, stuck = 0, wedged = 0;
struct intel_engine_cs *engine;
enum intel_engine_id id;
intel_wakeref_t wakeref;
if (!i915_modparams.enable_hangcheck)
return;
if (!READ_ONCE(gt->awake))
return;
if (intel_gt_is_wedged(gt))
return;
wakeref = intel_runtime_pm_get_if_in_use(gt->uncore->rpm);
if (!wakeref)
return;
/* As enabling the GPU requires fairly extensive mmio access,
* periodically arm the mmio checker to see if we are triggering
* any invalid access.
*/
intel_uncore_arm_unclaimed_mmio_detection(gt->uncore);
for_each_engine(engine, gt, id) {
struct hangcheck hc;
intel_engine_breadcrumbs_irq(engine);
hangcheck_load_sample(engine, &hc);
hangcheck_accumulate_sample(engine, &hc);
hangcheck_store_sample(engine, &hc);
if (hc.stalled) {
hung |= engine->mask;
if (hc.action != ENGINE_DEAD)
stuck |= engine->mask;
}
if (hc.wedged)
wedged |= engine->mask;
}
if (GEM_SHOW_DEBUG() && (hung | stuck)) {
struct drm_printer p = drm_debug_printer("hangcheck");
for_each_engine(engine, gt, id) {
if (intel_engine_is_idle(engine))
continue;
intel_engine_dump(engine, &p, "%s\n", engine->name);
}
}
if (wedged) {
dev_err(gt->i915->drm.dev,
"GPU recovery timed out,"
" cancelling all in-flight rendering.\n");
GEM_TRACE_DUMP();
intel_gt_set_wedged(gt);
}
if (hung)
hangcheck_declare_hang(gt, hung, stuck);
intel_runtime_pm_put(gt->uncore->rpm, wakeref);
/* Reset timer in case GPU hangs without another request being added */
intel_gt_queue_hangcheck(gt);
}
void intel_gt_queue_hangcheck(struct intel_gt *gt)
{
unsigned long delay;
if (unlikely(!i915_modparams.enable_hangcheck))
return;
/*
* Don't continually defer the hangcheck so that it is always run at
* least once after work has been scheduled on any ring. Otherwise,
* we will ignore a hung ring if a second ring is kept busy.
*/
delay = round_jiffies_up_relative(DRM_I915_HANGCHECK_JIFFIES);
queue_delayed_work(system_long_wq, &gt->hangcheck.work, delay);
}
void intel_engine_init_hangcheck(struct intel_engine_cs *engine)
{
memset(&engine->hangcheck, 0, sizeof(engine->hangcheck));
engine->hangcheck.action_timestamp = jiffies;
}
void intel_gt_init_hangcheck(struct intel_gt *gt)
{
INIT_DELAYED_WORK(&gt->hangcheck.work, hangcheck_elapsed);
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftest_hangcheck.c"
#endif

View File

@ -48,7 +48,7 @@ static bool get_ia_constants(struct intel_llc *llc,
struct ia_constants *consts)
{
struct drm_i915_private *i915 = llc_to_gt(llc)->i915;
struct intel_rps *rps = &i915->gt_pm.rps;
struct intel_rps *rps = &llc_to_gt(llc)->rps;
if (rps->max_freq <= rps->min_freq)
return false;

View File

@ -145,6 +145,7 @@
#include "intel_lrc_reg.h"
#include "intel_mocs.h"
#include "intel_reset.h"
#include "intel_ring.h"
#include "intel_workarounds.h"
#define RING_EXECLIST_QFULL (1 << 0x2)
@ -234,16 +235,9 @@ static void execlists_init_reg_state(u32 *reg_state,
const struct intel_engine_cs *engine,
const struct intel_ring *ring,
bool close);
static void __context_pin_acquire(struct intel_context *ce)
{
mutex_acquire(&ce->pin_mutex.dep_map, 2, 0, _RET_IP_);
}
static void __context_pin_release(struct intel_context *ce)
{
mutex_release(&ce->pin_mutex.dep_map, 0, _RET_IP_);
}
static void
__execlists_update_reg_state(const struct intel_context *ce,
const struct intel_engine_cs *engine);
static void mark_eio(struct i915_request *rq)
{
@ -256,6 +250,23 @@ static void mark_eio(struct i915_request *rq)
i915_request_mark_complete(rq);
}
static struct i915_request *
active_request(const struct intel_timeline * const tl, struct i915_request *rq)
{
struct i915_request *active = rq;
rcu_read_lock();
list_for_each_entry_continue_reverse(rq, &tl->requests, link) {
if (i915_request_completed(rq))
break;
active = rq;
}
rcu_read_unlock();
return active;
}
static inline u32 intel_hws_preempt_address(struct intel_engine_cs *engine)
{
return (i915_ggtt_offset(engine->status_page.vma) +
@ -460,8 +471,7 @@ lrc_descriptor(struct intel_context *ce, struct intel_engine_cs *engine)
if (IS_GEN(engine->i915, 8))
desc |= GEN8_CTX_L3LLC_COHERENT;
desc |= i915_ggtt_offset(ce->state) + LRC_HEADER_PAGES * PAGE_SIZE;
/* bits 12-31 */
desc |= i915_ggtt_offset(ce->state); /* bits 12-31 */
/*
* The following 32bits are copied into the OA reports (dword 2).
* Consider updating oa_get_render_ctx_id in i915_perf.c when changing
@ -925,6 +935,61 @@ execlists_context_status_change(struct i915_request *rq, unsigned long status)
status, rq);
}
static void intel_engine_context_in(struct intel_engine_cs *engine)
{
unsigned long flags;
if (READ_ONCE(engine->stats.enabled) == 0)
return;
write_seqlock_irqsave(&engine->stats.lock, flags);
if (engine->stats.enabled > 0) {
if (engine->stats.active++ == 0)
engine->stats.start = ktime_get();
GEM_BUG_ON(engine->stats.active == 0);
}
write_sequnlock_irqrestore(&engine->stats.lock, flags);
}
static void intel_engine_context_out(struct intel_engine_cs *engine)
{
unsigned long flags;
if (READ_ONCE(engine->stats.enabled) == 0)
return;
write_seqlock_irqsave(&engine->stats.lock, flags);
if (engine->stats.enabled > 0) {
ktime_t last;
if (engine->stats.active && --engine->stats.active == 0) {
/*
* Decrement the active context count and in case GPU
* is now idle add up to the running total.
*/
last = ktime_sub(ktime_get(), engine->stats.start);
engine->stats.total = ktime_add(engine->stats.total,
last);
} else if (engine->stats.active == 0) {
/*
* After turning on engine stats, context out might be
* the first event in which case we account from the
* time stats gathering was turned on.
*/
last = ktime_sub(ktime_get(), engine->stats.enabled_at);
engine->stats.total = ktime_add(engine->stats.total,
last);
}
}
write_sequnlock_irqrestore(&engine->stats.lock, flags);
}
static inline struct intel_engine_cs *
__execlists_schedule_in(struct i915_request *rq)
{
@ -982,6 +1047,59 @@ static void kick_siblings(struct i915_request *rq, struct intel_context *ce)
tasklet_schedule(&ve->base.execlists.tasklet);
}
static void restore_default_state(struct intel_context *ce,
struct intel_engine_cs *engine)
{
u32 *regs = ce->lrc_reg_state;
if (engine->pinned_default_state)
memcpy(regs, /* skip restoring the vanilla PPHWSP */
engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
engine->context_size - PAGE_SIZE);
execlists_init_reg_state(regs, ce, engine, ce->ring, false);
}
static void reset_active(struct i915_request *rq,
struct intel_engine_cs *engine)
{
struct intel_context * const ce = rq->hw_context;
u32 head;
/*
* The executing context has been cancelled. We want to prevent
* further execution along this context and propagate the error on
* to anything depending on its results.
*
* In __i915_request_submit(), we apply the -EIO and remove the
* requests' payloads for any banned requests. But first, we must
* rewind the context back to the start of the incomplete request so
* that we do not jump back into the middle of the batch.
*
* We preserve the breadcrumbs and semaphores of the incomplete
* requests so that inter-timeline dependencies (i.e other timelines)
* remain correctly ordered. And we defer to __i915_request_submit()
* so that all asynchronous waits are correctly handled.
*/
GEM_TRACE("%s(%s): { rq=%llx:%lld }\n",
__func__, engine->name, rq->fence.context, rq->fence.seqno);
/* On resubmission of the active request, payload will be scrubbed */
if (i915_request_completed(rq))
head = rq->tail;
else
head = active_request(ce->timeline, rq)->head;
ce->ring->head = intel_ring_wrap(ce->ring, head);
intel_ring_update_space(ce->ring);
/* Scrub the context image to prevent replaying the previous batch */
restore_default_state(ce, engine);
__execlists_update_reg_state(ce, engine);
/* We've switched away, so this should be a no-op, but intent matters */
ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
}
static inline void
__execlists_schedule_out(struct i915_request *rq,
struct intel_engine_cs * const engine)
@ -992,6 +1110,9 @@ __execlists_schedule_out(struct i915_request *rq,
execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
intel_gt_pm_put(engine->gt);
if (unlikely(i915_gem_context_is_banned(ce->gem_context)))
reset_active(rq, engine);
/*
* If this is part of a virtual engine, its next request may
* have been blocked waiting for access to the active context.
@ -1345,7 +1466,7 @@ need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
{
int hint;
if (!intel_engine_has_semaphores(engine))
if (!intel_engine_has_timeslices(engine))
return false;
if (list_is_last(&rq->sched.link, &engine->active.requests))
@ -1366,15 +1487,32 @@ switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq)
return rq_prio(list_next_entry(rq, sched.link));
}
static bool
enable_timeslice(const struct intel_engine_execlists *execlists)
static inline unsigned long
timeslice(const struct intel_engine_cs *engine)
{
const struct i915_request *rq = *execlists->active;
return READ_ONCE(engine->props.timeslice_duration_ms);
}
static unsigned long
active_timeslice(const struct intel_engine_cs *engine)
{
const struct i915_request *rq = *engine->execlists.active;
if (i915_request_completed(rq))
return false;
return 0;
return execlists->switch_priority_hint >= effective_prio(rq);
if (engine->execlists.switch_priority_hint < effective_prio(rq))
return 0;
return timeslice(engine);
}
static void set_timeslice(struct intel_engine_cs *engine)
{
if (!intel_engine_has_timeslices(engine))
return;
set_timer_ms(&engine->execlists.timer, active_timeslice(engine));
}
static void record_preemption(struct intel_engine_execlists *execlists)
@ -1382,6 +1520,30 @@ static void record_preemption(struct intel_engine_execlists *execlists)
(void)I915_SELFTEST_ONLY(execlists->preempt_hang.count++);
}
static unsigned long active_preempt_timeout(struct intel_engine_cs *engine)
{
struct i915_request *rq;
rq = last_active(&engine->execlists);
if (!rq)
return 0;
/* Force a fast reset for terminated contexts (ignoring sysfs!) */
if (unlikely(i915_gem_context_is_banned(rq->gem_context)))
return 1;
return READ_ONCE(engine->props.preempt_timeout_ms);
}
static void set_preempt_timeout(struct intel_engine_cs *engine)
{
if (!intel_engine_has_preempt_reset(engine))
return;
set_timer_ms(&engine->execlists.preempt,
active_preempt_timeout(engine));
}
static void execlists_dequeue(struct intel_engine_cs *engine)
{
struct intel_engine_execlists * const execlists = &engine->execlists;
@ -1521,8 +1683,9 @@ static void execlists_dequeue(struct intel_engine_cs *engine)
*/
if (!execlists->timer.expires &&
need_timeslice(engine, last))
mod_timer(&execlists->timer,
jiffies + 1);
set_timer_ms(&execlists->timer,
timeslice(engine));
return;
}
@ -1757,6 +1920,8 @@ done:
memset(port + 1, 0, (last_port - port) * sizeof(*port));
execlists_submit_ports(engine);
set_preempt_timeout(engine);
} else {
skip_submit:
ring_set_paused(engine, 0);
@ -1867,7 +2032,7 @@ static void process_csb(struct intel_engine_cs *engine)
*/
GEM_BUG_ON(!tasklet_is_locked(&execlists->tasklet) &&
!reset_in_progress(execlists));
GEM_BUG_ON(USES_GUC_SUBMISSION(engine->i915));
GEM_BUG_ON(!intel_engine_in_execlists_submission_mode(engine));
/*
* Note that csb_write, csb_status may be either in HWSP or mmio.
@ -1944,10 +2109,7 @@ static void process_csb(struct intel_engine_cs *engine)
execlists_num_ports(execlists) *
sizeof(*execlists->pending));
if (enable_timeslice(execlists))
mod_timer(&execlists->timer, jiffies + 1);
else
cancel_timer(&execlists->timer);
set_timeslice(engine);
WRITE_ONCE(execlists->pending[0], NULL);
} else {
@ -1997,6 +2159,43 @@ static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
}
}
static noinline void preempt_reset(struct intel_engine_cs *engine)
{
const unsigned int bit = I915_RESET_ENGINE + engine->id;
unsigned long *lock = &engine->gt->reset.flags;
if (i915_modparams.reset < 3)
return;
if (test_and_set_bit(bit, lock))
return;
/* Mark this tasklet as disabled to avoid waiting for it to complete */
tasklet_disable_nosync(&engine->execlists.tasklet);
GEM_TRACE("%s: preempt timeout %lu+%ums\n",
engine->name,
READ_ONCE(engine->props.preempt_timeout_ms),
jiffies_to_msecs(jiffies - engine->execlists.preempt.expires));
intel_engine_reset(engine, "preemption time out");
tasklet_enable(&engine->execlists.tasklet);
clear_and_wake_up_bit(bit, lock);
}
static bool preempt_timeout(const struct intel_engine_cs *const engine)
{
const struct timer_list *t = &engine->execlists.preempt;
if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT)
return false;
if (!timer_expired(t))
return false;
return READ_ONCE(engine->execlists.pending[0]);
}
/*
* Check the unread Context Status Buffers and manage the submission of new
* contexts to the ELSP accordingly.
@ -2004,23 +2203,39 @@ static void __execlists_submission_tasklet(struct intel_engine_cs *const engine)
static void execlists_submission_tasklet(unsigned long data)
{
struct intel_engine_cs * const engine = (struct intel_engine_cs *)data;
unsigned long flags;
bool timeout = preempt_timeout(engine);
process_csb(engine);
if (!READ_ONCE(engine->execlists.pending[0])) {
if (!READ_ONCE(engine->execlists.pending[0]) || timeout) {
unsigned long flags;
spin_lock_irqsave(&engine->active.lock, flags);
__execlists_submission_tasklet(engine);
spin_unlock_irqrestore(&engine->active.lock, flags);
/* Recheck after serialising with direct-submission */
if (timeout && preempt_timeout(engine))
preempt_reset(engine);
}
}
static void execlists_submission_timer(struct timer_list *timer)
static void __execlists_kick(struct intel_engine_execlists *execlists)
{
struct intel_engine_cs *engine =
from_timer(engine, timer, execlists.timer);
/* Kick the tasklet for some interrupt coalescing and reset handling */
tasklet_hi_schedule(&engine->execlists.tasklet);
tasklet_hi_schedule(&execlists->tasklet);
}
#define execlists_kick(t, member) \
__execlists_kick(container_of(t, struct intel_engine_execlists, member))
static void execlists_timeslice(struct timer_list *timer)
{
execlists_kick(timer, timer);
}
static void execlists_preempt(struct timer_list *timer)
{
execlists_kick(timer, preempt);
}
static void queue_request(struct intel_engine_cs *engine,
@ -2100,7 +2315,6 @@ set_redzone(void *vaddr, const struct intel_engine_cs *engine)
if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
return;
vaddr += LRC_HEADER_PAGES * PAGE_SIZE;
vaddr += engine->context_size;
memset(vaddr, POISON_INUSE, I915_GTT_PAGE_SIZE);
@ -2112,7 +2326,6 @@ check_redzone(const void *vaddr, const struct intel_engine_cs *engine)
if (!IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
return;
vaddr += LRC_HEADER_PAGES * PAGE_SIZE;
vaddr += engine->context_size;
if (memchr_inv(vaddr, POISON_INUSE, I915_GTT_PAGE_SIZE))
@ -2727,37 +2940,28 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
&execlists->csb_status[reset_value]);
}
static struct i915_request *active_request(struct i915_request *rq)
static int lrc_ring_mi_mode(const struct intel_engine_cs *engine)
{
const struct intel_context * const ce = rq->hw_context;
struct i915_request *active = NULL;
struct list_head *list;
if (!i915_request_is_active(rq)) /* unwound, but incomplete! */
return rq;
list = &i915_request_active_timeline(rq)->requests;
list_for_each_entry_from_reverse(rq, list, link) {
if (i915_request_completed(rq))
break;
if (rq->hw_context != ce)
break;
active = rq;
}
return active;
if (INTEL_GEN(engine->i915) >= 12)
return 0x60;
else if (INTEL_GEN(engine->i915) >= 9)
return 0x54;
else if (engine->class == RENDER_CLASS)
return 0x58;
else
return -1;
}
static void __execlists_reset_reg_state(const struct intel_context *ce,
const struct intel_engine_cs *engine)
{
u32 *regs = ce->lrc_reg_state;
int x;
if (INTEL_GEN(engine->i915) >= 9) {
regs[GEN9_CTX_RING_MI_MODE + 1] &= ~STOP_RING;
regs[GEN9_CTX_RING_MI_MODE + 1] |= STOP_RING << 16;
x = lrc_ring_mi_mode(engine);
if (x != -1) {
regs[x + 1] &= ~STOP_RING;
regs[x + 1] |= STOP_RING << 16;
}
}
@ -2766,7 +2970,6 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
struct intel_engine_execlists * const execlists = &engine->execlists;
struct intel_context *ce;
struct i915_request *rq;
u32 *regs;
mb(); /* paranoia: read the CSB pointers from after the reset */
clflush(execlists->csb_write);
@ -2792,19 +2995,17 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
ce = rq->hw_context;
GEM_BUG_ON(!i915_vma_is_pinned(ce->state));
/* Proclaim we have exclusive access to the context image! */
__context_pin_acquire(ce);
rq = active_request(rq);
if (!rq) {
if (i915_request_completed(rq)) {
/* Idle context; tidy up the ring so we can restart afresh */
ce->ring->head = ce->ring->tail;
ce->ring->head = intel_ring_wrap(ce->ring, rq->tail);
goto out_replay;
}
/* Context has requests still in-flight; it should not be idle! */
GEM_BUG_ON(i915_active_is_idle(&ce->active));
rq = active_request(ce->timeline, rq);
ce->ring->head = intel_ring_wrap(ce->ring, rq->head);
GEM_BUG_ON(ce->ring->head == ce->ring->tail);
/*
* If this request hasn't started yet, e.g. it is waiting on a
@ -2845,22 +3046,15 @@ static void __execlists_reset(struct intel_engine_cs *engine, bool stalled)
* to recreate its own state.
*/
GEM_BUG_ON(!intel_context_is_pinned(ce));
regs = ce->lrc_reg_state;
if (engine->pinned_default_state) {
memcpy(regs, /* skip restoring the vanilla PPHWSP */
engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
engine->context_size - PAGE_SIZE);
}
execlists_init_reg_state(regs, ce, engine, ce->ring, false);
restore_default_state(ce, engine);
out_replay:
GEM_TRACE("%s replay {head:%04x, tail:%04x\n",
GEM_TRACE("%s replay {head:%04x, tail:%04x}\n",
engine->name, ce->ring->head, ce->ring->tail);
intel_ring_update_space(ce->ring);
__execlists_reset_reg_state(ce, engine);
__execlists_update_reg_state(ce, engine);
ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
__context_pin_release(ce);
unwind:
/* Push back any incomplete requests for replay after the reset. */
@ -3469,6 +3663,7 @@ gen12_emit_fini_breadcrumb_rcs(struct i915_request *request, u32 *cs)
static void execlists_park(struct intel_engine_cs *engine)
{
cancel_timer(&engine->execlists.timer);
cancel_timer(&engine->execlists.preempt);
}
void intel_execlists_set_default_submission(struct intel_engine_cs *engine)
@ -3586,7 +3781,8 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine)
{
tasklet_init(&engine->execlists.tasklet,
execlists_submission_tasklet, (unsigned long)engine);
timer_setup(&engine->execlists.timer, execlists_submission_timer, 0);
timer_setup(&engine->execlists.timer, execlists_timeslice, 0);
timer_setup(&engine->execlists.preempt, execlists_preempt, 0);
logical_ring_default_vfuncs(engine);
logical_ring_default_irqs(engine);
@ -3796,12 +3992,6 @@ populate_lr_context(struct intel_context *ce,
set_redzone(vaddr, engine);
if (engine->default_state) {
/*
* We only want to copy over the template context state;
* skipping over the headers reserved for GuC communication,
* leaving those as zero.
*/
const unsigned long start = LRC_HEADER_PAGES * PAGE_SIZE;
void *defaults;
defaults = i915_gem_object_pin_map(engine->default_state,
@ -3811,7 +4001,7 @@ populate_lr_context(struct intel_context *ce,
goto err_unpin_ctx;
}
memcpy(vaddr + start, defaults + start, engine->context_size);
memcpy(vaddr, defaults, engine->context_size);
i915_gem_object_unpin_map(engine->default_state);
inhibit = false;
}
@ -3826,9 +4016,7 @@ populate_lr_context(struct intel_context *ce,
ret = 0;
err_unpin_ctx:
__i915_gem_object_flush_map(ctx_obj,
LRC_HEADER_PAGES * PAGE_SIZE,
engine->context_size);
__i915_gem_object_flush_map(ctx_obj, 0, engine->context_size);
i915_gem_object_unpin_map(ctx_obj);
return ret;
}
@ -3845,11 +4033,6 @@ static int __execlists_context_alloc(struct intel_context *ce,
GEM_BUG_ON(ce->state);
context_size = round_up(engine->context_size, I915_GTT_PAGE_SIZE);
/*
* Before the actual start of the context image, we insert a few pages
* for our own use and for sharing with the GuC.
*/
context_size += LRC_HEADER_PAGES * PAGE_SIZE;
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
context_size += I915_GTT_PAGE_SIZE; /* for redzone */
@ -4502,7 +4685,6 @@ void intel_lr_context_reset(struct intel_engine_cs *engine,
bool scrub)
{
GEM_BUG_ON(!intel_context_is_pinned(ce));
__context_pin_acquire(ce);
/*
* We want a simple context + ring to execute the breadcrumb update.
@ -4512,23 +4694,21 @@ void intel_lr_context_reset(struct intel_engine_cs *engine,
* future request will be after userspace has had the opportunity
* to recreate its own state.
*/
if (scrub) {
u32 *regs = ce->lrc_reg_state;
if (engine->pinned_default_state) {
memcpy(regs, /* skip restoring the vanilla PPHWSP */
engine->pinned_default_state + LRC_STATE_PN * PAGE_SIZE,
engine->context_size - PAGE_SIZE);
}
execlists_init_reg_state(regs, ce, engine, ce->ring, false);
}
if (scrub)
restore_default_state(ce, engine);
/* Rerun the request; its payload has been neutered (if guilty). */
ce->ring->head = head;
intel_ring_update_space(ce->ring);
__execlists_update_reg_state(ce, engine);
__context_pin_release(ce);
}
bool
intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine)
{
return engine->set_default_submission ==
intel_execlists_set_default_submission;
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)

View File

@ -43,6 +43,7 @@ struct intel_engine_cs;
#define CTX_CTRL_ENGINE_CTX_RESTORE_INHIBIT (1 << 0)
#define CTX_CTRL_RS_CTX_ENABLE (1 << 1)
#define CTX_CTRL_ENGINE_CTX_SAVE_INHIBIT (1 << 2)
#define GEN12_CTX_CTRL_OAR_CONTEXT_ENABLE (1 << 8)
#define RING_CONTEXT_STATUS_PTR(base) _MMIO((base) + 0x3a0)
#define RING_EXECLIST_SQ_CONTENTS(base) _MMIO((base) + 0x510)
#define RING_EXECLIST_CONTROL(base) _MMIO((base) + 0x550)
@ -85,31 +86,12 @@ int intel_execlists_submission_setup(struct intel_engine_cs *engine);
int intel_execlists_submission_init(struct intel_engine_cs *engine);
/* Logical Ring Contexts */
/*
* We allocate a header at the start of the context image for our own
* use, therefore the actual location of the logical state is offset
* from the start of the VMA. The layout is
*
* | [guc] | [hwsp] [logical state] |
* |<- our header ->|<- context image ->|
*
*/
/* The first page is used for sharing data with the GuC */
#define LRC_GUCSHR_PN (0)
#define LRC_GUCSHR_SZ (1)
/* At the start of the context image is its per-process HWS page */
#define LRC_PPHWSP_PN (LRC_GUCSHR_PN + LRC_GUCSHR_SZ)
#define LRC_PPHWSP_PN (0)
#define LRC_PPHWSP_SZ (1)
/* Finally we have the logical state for the context */
/* After the PPHWSP we have the logical state for the context */
#define LRC_STATE_PN (LRC_PPHWSP_PN + LRC_PPHWSP_SZ)
/*
* Currently we include the PPHWSP in __intel_engine_context_size() so
* the size of the header is synonymous with the start of the PPHWSP.
*/
#define LRC_HEADER_PAGES LRC_PPHWSP_PN
/* Space within PPHWSP reserved to be used as scratch */
#define LRC_PPHWSP_SCRATCH 0x34
#define LRC_PPHWSP_SCRATCH_ADDR (LRC_PPHWSP_SCRATCH * sizeof(u32))
@ -145,4 +127,7 @@ struct intel_engine_cs *
intel_virtual_engine_get_sibling(struct intel_engine_cs *engine,
unsigned int sibling);
bool
intel_engine_in_execlists_submission_mode(const struct intel_engine_cs *engine);
#endif /* _INTEL_LRC_H_ */

View File

@ -26,6 +26,7 @@
#include "intel_gt.h"
#include "intel_mocs.h"
#include "intel_lrc.h"
#include "intel_ring.h"
/* structures required */
struct drm_i915_mocs_entry {
@ -461,6 +462,12 @@ static void intel_mocs_init_global(struct intel_gt *gt)
struct drm_i915_mocs_table table;
unsigned int index;
/*
* LLC and eDRAM control values are not applicable to dgfx
*/
if (IS_DGFX(gt->i915))
return;
GEM_BUG_ON(!HAS_GLOBAL_MOCS_REGISTERS(gt->i915));
if (!get_mocs_settings(gt->i915, &table))

View File

@ -27,6 +27,7 @@
#include "i915_drv.h"
#include "intel_renderstate.h"
#include "intel_ring.h"
struct intel_renderstate {
const struct intel_renderstate_rodata *rodata;

View File

@ -1024,8 +1024,6 @@ void intel_gt_reset(struct intel_gt *gt,
if (ret)
goto taint;
intel_gt_queue_hangcheck(gt);
finish:
reset_finish(gt, awake);
unlock:
@ -1353,4 +1351,5 @@ void __intel_fini_wedge(struct intel_wedge_me *w)
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftest_reset.c"
#include "selftest_hangcheck.c"
#endif

View File

@ -0,0 +1,323 @@
/*
* SPDX-License-Identifier: MIT
*
* Copyright © 2019 Intel Corporation
*/
#include "gem/i915_gem_object.h"
#include "i915_drv.h"
#include "i915_vma.h"
#include "intel_engine.h"
#include "intel_ring.h"
#include "intel_timeline.h"
unsigned int intel_ring_update_space(struct intel_ring *ring)
{
unsigned int space;
space = __intel_ring_space(ring->head, ring->emit, ring->size);
ring->space = space;
return space;
}
int intel_ring_pin(struct intel_ring *ring)
{
struct i915_vma *vma = ring->vma;
unsigned int flags;
void *addr;
int ret;
if (atomic_fetch_inc(&ring->pin_count))
return 0;
flags = PIN_GLOBAL;
/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
flags |= PIN_OFFSET_BIAS | i915_ggtt_pin_bias(vma);
if (vma->obj->stolen)
flags |= PIN_MAPPABLE;
else
flags |= PIN_HIGH;
ret = i915_vma_pin(vma, 0, 0, flags);
if (unlikely(ret))
goto err_unpin;
if (i915_vma_is_map_and_fenceable(vma))
addr = (void __force *)i915_vma_pin_iomap(vma);
else
addr = i915_gem_object_pin_map(vma->obj,
i915_coherent_map_type(vma->vm->i915));
if (IS_ERR(addr)) {
ret = PTR_ERR(addr);
goto err_ring;
}
i915_vma_make_unshrinkable(vma);
GEM_BUG_ON(ring->vaddr);
ring->vaddr = addr;
return 0;
err_ring:
i915_vma_unpin(vma);
err_unpin:
atomic_dec(&ring->pin_count);
return ret;
}
void intel_ring_reset(struct intel_ring *ring, u32 tail)
{
tail = intel_ring_wrap(ring, tail);
ring->tail = tail;
ring->head = tail;
ring->emit = tail;
intel_ring_update_space(ring);
}
void intel_ring_unpin(struct intel_ring *ring)
{
struct i915_vma *vma = ring->vma;
if (!atomic_dec_and_test(&ring->pin_count))
return;
/* Discard any unused bytes beyond that submitted to hw. */
intel_ring_reset(ring, ring->emit);
i915_vma_unset_ggtt_write(vma);
if (i915_vma_is_map_and_fenceable(vma))
i915_vma_unpin_iomap(vma);
else
i915_gem_object_unpin_map(vma->obj);
GEM_BUG_ON(!ring->vaddr);
ring->vaddr = NULL;
i915_vma_unpin(vma);
i915_vma_make_purgeable(vma);
}
static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
{
struct i915_address_space *vm = &ggtt->vm;
struct drm_i915_private *i915 = vm->i915;
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
obj = ERR_PTR(-ENODEV);
if (i915_ggtt_has_aperture(ggtt))
obj = i915_gem_object_create_stolen(i915, size);
if (IS_ERR(obj))
obj = i915_gem_object_create_internal(i915, size);
if (IS_ERR(obj))
return ERR_CAST(obj);
/*
* Mark ring buffers as read-only from GPU side (so no stray overwrites)
* if supported by the platform's GGTT.
*/
if (vm->has_read_only)
i915_gem_object_set_readonly(obj);
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma))
goto err;
return vma;
err:
i915_gem_object_put(obj);
return vma;
}
struct intel_ring *
intel_engine_create_ring(struct intel_engine_cs *engine, int size)
{
struct drm_i915_private *i915 = engine->i915;
struct intel_ring *ring;
struct i915_vma *vma;
GEM_BUG_ON(!is_power_of_2(size));
GEM_BUG_ON(RING_CTL_SIZE(size) & ~RING_NR_PAGES);
ring = kzalloc(sizeof(*ring), GFP_KERNEL);
if (!ring)
return ERR_PTR(-ENOMEM);
kref_init(&ring->ref);
ring->size = size;
/*
* Workaround an erratum on the i830 which causes a hang if
* the TAIL pointer points to within the last 2 cachelines
* of the buffer.
*/
ring->effective_size = size;
if (IS_I830(i915) || IS_I845G(i915))
ring->effective_size -= 2 * CACHELINE_BYTES;
intel_ring_update_space(ring);
vma = create_ring_vma(engine->gt->ggtt, size);
if (IS_ERR(vma)) {
kfree(ring);
return ERR_CAST(vma);
}
ring->vma = vma;
return ring;
}
void intel_ring_free(struct kref *ref)
{
struct intel_ring *ring = container_of(ref, typeof(*ring), ref);
i915_vma_put(ring->vma);
kfree(ring);
}
static noinline int
wait_for_space(struct intel_ring *ring,
struct intel_timeline *tl,
unsigned int bytes)
{
struct i915_request *target;
long timeout;
if (intel_ring_update_space(ring) >= bytes)
return 0;
GEM_BUG_ON(list_empty(&tl->requests));
list_for_each_entry(target, &tl->requests, link) {
if (target->ring != ring)
continue;
/* Would completion of this request free enough space? */
if (bytes <= __intel_ring_space(target->postfix,
ring->emit, ring->size))
break;
}
if (GEM_WARN_ON(&target->link == &tl->requests))
return -ENOSPC;
timeout = i915_request_wait(target,
I915_WAIT_INTERRUPTIBLE,
MAX_SCHEDULE_TIMEOUT);
if (timeout < 0)
return timeout;
i915_request_retire_upto(target);
intel_ring_update_space(ring);
GEM_BUG_ON(ring->space < bytes);
return 0;
}
u32 *intel_ring_begin(struct i915_request *rq, unsigned int num_dwords)
{
struct intel_ring *ring = rq->ring;
const unsigned int remain_usable = ring->effective_size - ring->emit;
const unsigned int bytes = num_dwords * sizeof(u32);
unsigned int need_wrap = 0;
unsigned int total_bytes;
u32 *cs;
/* Packets must be qword aligned. */
GEM_BUG_ON(num_dwords & 1);
total_bytes = bytes + rq->reserved_space;
GEM_BUG_ON(total_bytes > ring->effective_size);
if (unlikely(total_bytes > remain_usable)) {
const int remain_actual = ring->size - ring->emit;
if (bytes > remain_usable) {
/*
* Not enough space for the basic request. So need to
* flush out the remainder and then wait for
* base + reserved.
*/
total_bytes += remain_actual;
need_wrap = remain_actual | 1;
} else {
/*
* The base request will fit but the reserved space
* falls off the end. So we don't need an immediate
* wrap and only need to effectively wait for the
* reserved size from the start of ringbuffer.
*/
total_bytes = rq->reserved_space + remain_actual;
}
}
if (unlikely(total_bytes > ring->space)) {
int ret;
/*
* Space is reserved in the ringbuffer for finalising the
* request, as that cannot be allowed to fail. During request
* finalisation, reserved_space is set to 0 to stop the
* overallocation and the assumption is that then we never need
* to wait (which has the risk of failing with EINTR).
*
* See also i915_request_alloc() and i915_request_add().
*/
GEM_BUG_ON(!rq->reserved_space);
ret = wait_for_space(ring,
i915_request_timeline(rq),
total_bytes);
if (unlikely(ret))
return ERR_PTR(ret);
}
if (unlikely(need_wrap)) {
need_wrap &= ~1;
GEM_BUG_ON(need_wrap > ring->space);
GEM_BUG_ON(ring->emit + need_wrap > ring->size);
GEM_BUG_ON(!IS_ALIGNED(need_wrap, sizeof(u64)));
/* Fill the tail with MI_NOOP */
memset64(ring->vaddr + ring->emit, 0, need_wrap / sizeof(u64));
ring->space -= need_wrap;
ring->emit = 0;
}
GEM_BUG_ON(ring->emit > ring->size - bytes);
GEM_BUG_ON(ring->space < bytes);
cs = ring->vaddr + ring->emit;
GEM_DEBUG_EXEC(memset32(cs, POISON_INUSE, bytes / sizeof(*cs)));
ring->emit += bytes;
ring->space -= bytes;
return cs;
}
/* Align the ring tail to a cacheline boundary */
int intel_ring_cacheline_align(struct i915_request *rq)
{
int num_dwords;
void *cs;
num_dwords = (rq->ring->emit & (CACHELINE_BYTES - 1)) / sizeof(u32);
if (num_dwords == 0)
return 0;
num_dwords = CACHELINE_DWORDS - num_dwords;
GEM_BUG_ON(num_dwords & 1);
cs = intel_ring_begin(rq, num_dwords);
if (IS_ERR(cs))
return PTR_ERR(cs);
memset64(cs, (u64)MI_NOOP << 32 | MI_NOOP, num_dwords / 2);
intel_ring_advance(rq, cs + num_dwords);
GEM_BUG_ON(rq->ring->emit & (CACHELINE_BYTES - 1));
return 0;
}

View File

@ -0,0 +1,131 @@
/*
* SPDX-License-Identifier: MIT
*
* Copyright © 2019 Intel Corporation
*/
#ifndef INTEL_RING_H
#define INTEL_RING_H
#include "i915_gem.h" /* GEM_BUG_ON */
#include "i915_request.h"
#include "intel_ring_types.h"
struct intel_engine_cs;
struct intel_ring *
intel_engine_create_ring(struct intel_engine_cs *engine, int size);
u32 *intel_ring_begin(struct i915_request *rq, unsigned int num_dwords);
int intel_ring_cacheline_align(struct i915_request *rq);
unsigned int intel_ring_update_space(struct intel_ring *ring);
int intel_ring_pin(struct intel_ring *ring);
void intel_ring_unpin(struct intel_ring *ring);
void intel_ring_reset(struct intel_ring *ring, u32 tail);
void intel_ring_free(struct kref *ref);
static inline struct intel_ring *intel_ring_get(struct intel_ring *ring)
{
kref_get(&ring->ref);
return ring;
}
static inline void intel_ring_put(struct intel_ring *ring)
{
kref_put(&ring->ref, intel_ring_free);
}
static inline void intel_ring_advance(struct i915_request *rq, u32 *cs)
{
/* Dummy function.
*
* This serves as a placeholder in the code so that the reader
* can compare against the preceding intel_ring_begin() and
* check that the number of dwords emitted matches the space
* reserved for the command packet (i.e. the value passed to
* intel_ring_begin()).
*/
GEM_BUG_ON((rq->ring->vaddr + rq->ring->emit) != cs);
}
static inline u32 intel_ring_wrap(const struct intel_ring *ring, u32 pos)
{
return pos & (ring->size - 1);
}
static inline bool
intel_ring_offset_valid(const struct intel_ring *ring,
unsigned int pos)
{
if (pos & -ring->size) /* must be strictly within the ring */
return false;
if (!IS_ALIGNED(pos, 8)) /* must be qword aligned */
return false;
return true;
}
static inline u32 intel_ring_offset(const struct i915_request *rq, void *addr)
{
/* Don't write ring->size (equivalent to 0) as that hangs some GPUs. */
u32 offset = addr - rq->ring->vaddr;
GEM_BUG_ON(offset > rq->ring->size);
return intel_ring_wrap(rq->ring, offset);
}
static inline void
assert_ring_tail_valid(const struct intel_ring *ring, unsigned int tail)
{
GEM_BUG_ON(!intel_ring_offset_valid(ring, tail));
/*
* "Ring Buffer Use"
* Gen2 BSpec "1. Programming Environment" / 1.4.4.6
* Gen3 BSpec "1c Memory Interface Functions" / 2.3.4.5
* Gen4+ BSpec "1c Memory Interface and Command Stream" / 5.3.4.5
* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
* same cacheline, the Head Pointer must not be greater than the Tail
* Pointer."
*
* We use ring->head as the last known location of the actual RING_HEAD,
* it may have advanced but in the worst case it is equally the same
* as ring->head and so we should never program RING_TAIL to advance
* into the same cacheline as ring->head.
*/
#define cacheline(a) round_down(a, CACHELINE_BYTES)
GEM_BUG_ON(cacheline(tail) == cacheline(ring->head) &&
tail < ring->head);
#undef cacheline
}
static inline unsigned int
intel_ring_set_tail(struct intel_ring *ring, unsigned int tail)
{
/* Whilst writes to the tail are strictly order, there is no
* serialisation between readers and the writers. The tail may be
* read by i915_request_retire() just as it is being updated
* by execlists, as although the breadcrumb is complete, the context
* switch hasn't been seen.
*/
assert_ring_tail_valid(ring, tail);
ring->tail = tail;
return tail;
}
static inline unsigned int
__intel_ring_space(unsigned int head, unsigned int tail, unsigned int size)
{
/*
* "If the Ring Buffer Head Pointer and the Tail Pointer are on the
* same cacheline, the Head Pointer must not be greater than the Tail
* Pointer."
*/
GEM_BUG_ON(!is_power_of_2(size));
return (head - tail - CACHELINE_BYTES) & (size - 1);
}
#endif /* INTEL_RING_H */

View File

@ -40,6 +40,7 @@
#include "intel_gt_irq.h"
#include "intel_gt_pm_irq.h"
#include "intel_reset.h"
#include "intel_ring.h"
#include "intel_workarounds.h"
/* Rough estimate of the typical request size, performing a flush,
@ -47,16 +48,6 @@
*/
#define LEGACY_REQUEST_SIZE 200
unsigned int intel_ring_update_space(struct intel_ring *ring)
{
unsigned int space;
space = __intel_ring_space(ring->head, ring->emit, ring->size);
ring->space = space;
return space;
}
static int
gen2_render_ring_flush(struct i915_request *rq, u32 mode)
{
@ -1186,162 +1177,6 @@ i915_emit_bb_start(struct i915_request *rq,
return 0;
}
int intel_ring_pin(struct intel_ring *ring)
{
struct i915_vma *vma = ring->vma;
unsigned int flags;
void *addr;
int ret;
if (atomic_fetch_inc(&ring->pin_count))
return 0;
flags = PIN_GLOBAL;
/* Ring wraparound at offset 0 sometimes hangs. No idea why. */
flags |= PIN_OFFSET_BIAS | i915_ggtt_pin_bias(vma);
if (vma->obj->stolen)
flags |= PIN_MAPPABLE;
else
flags |= PIN_HIGH;
ret = i915_vma_pin(vma, 0, 0, flags);
if (unlikely(ret))
goto err_unpin;
if (i915_vma_is_map_and_fenceable(vma))
addr = (void __force *)i915_vma_pin_iomap(vma);
else
addr = i915_gem_object_pin_map(vma->obj,
i915_coherent_map_type(vma->vm->i915));
if (IS_ERR(addr)) {
ret = PTR_ERR(addr);
goto err_ring;
}
i915_vma_make_unshrinkable(vma);
GEM_BUG_ON(ring->vaddr);
ring->vaddr = addr;
return 0;
err_ring:
i915_vma_unpin(vma);
err_unpin:
atomic_dec(&ring->pin_count);
return ret;
}
void intel_ring_reset(struct intel_ring *ring, u32 tail)
{
tail = intel_ring_wrap(ring, tail);
ring->tail = tail;
ring->head = tail;
ring->emit = tail;
intel_ring_update_space(ring);
}
void intel_ring_unpin(struct intel_ring *ring)
{
struct i915_vma *vma = ring->vma;
if (!atomic_dec_and_test(&ring->pin_count))
return;
/* Discard any unused bytes beyond that submitted to hw. */
intel_ring_reset(ring, ring->emit);
i915_vma_unset_ggtt_write(vma);
if (i915_vma_is_map_and_fenceable(vma))
i915_vma_unpin_iomap(vma);
else
i915_gem_object_unpin_map(vma->obj);
GEM_BUG_ON(!ring->vaddr);
ring->vaddr = NULL;
i915_vma_unpin(vma);
i915_vma_make_purgeable(vma);
}
static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
{
struct i915_address_space *vm = &ggtt->vm;
struct drm_i915_private *i915 = vm->i915;
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
obj = i915_gem_object_create_stolen(i915, size);
if (IS_ERR(obj))
obj = i915_gem_object_create_internal(i915, size);
if (IS_ERR(obj))
return ERR_CAST(obj);
/*
* Mark ring buffers as read-only from GPU side (so no stray overwrites)
* if supported by the platform's GGTT.
*/
if (vm->has_read_only)
i915_gem_object_set_readonly(obj);
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma))
goto err;
return vma;
err:
i915_gem_object_put(obj);
return vma;
}
struct intel_ring *
intel_engine_create_ring(struct intel_engine_cs *engine, int size)
{
struct drm_i915_private *i915 = engine->i915;
struct intel_ring *ring;
struct i915_vma *vma;
GEM_BUG_ON(!is_power_of_2(size));
GEM_BUG_ON(RING_CTL_SIZE(size) & ~RING_NR_PAGES);
ring = kzalloc(sizeof(*ring), GFP_KERNEL);
if (!ring)
return ERR_PTR(-ENOMEM);
kref_init(&ring->ref);
ring->size = size;
/* Workaround an erratum on the i830 which causes a hang if
* the TAIL pointer points to within the last 2 cachelines
* of the buffer.
*/
ring->effective_size = size;
if (IS_I830(i915) || IS_I845G(i915))
ring->effective_size -= 2 * CACHELINE_BYTES;
intel_ring_update_space(ring);
vma = create_ring_vma(engine->gt->ggtt, size);
if (IS_ERR(vma)) {
kfree(ring);
return ERR_CAST(vma);
}
ring->vma = vma;
return ring;
}
void intel_ring_free(struct kref *ref)
{
struct intel_ring *ring = container_of(ref, typeof(*ring), ref);
i915_vma_put(ring->vma);
kfree(ring);
}
static void __ring_context_fini(struct intel_context *ce)
{
i915_vma_put(ce->state);
@ -1836,148 +1671,6 @@ static int ring_request_alloc(struct i915_request *request)
return 0;
}
static noinline int
wait_for_space(struct intel_ring *ring,
struct intel_timeline *tl,
unsigned int bytes)
{
struct i915_request *target;
long timeout;
if (intel_ring_update_space(ring) >= bytes)
return 0;
GEM_BUG_ON(list_empty(&tl->requests));
list_for_each_entry(target, &tl->requests, link) {
if (target->ring != ring)
continue;
/* Would completion of this request free enough space? */
if (bytes <= __intel_ring_space(target->postfix,
ring->emit, ring->size))
break;
}
if (GEM_WARN_ON(&target->link == &tl->requests))
return -ENOSPC;
timeout = i915_request_wait(target,
I915_WAIT_INTERRUPTIBLE,
MAX_SCHEDULE_TIMEOUT);
if (timeout < 0)
return timeout;
i915_request_retire_upto(target);
intel_ring_update_space(ring);
GEM_BUG_ON(ring->space < bytes);
return 0;
}
u32 *intel_ring_begin(struct i915_request *rq, unsigned int num_dwords)
{
struct intel_ring *ring = rq->ring;
const unsigned int remain_usable = ring->effective_size - ring->emit;
const unsigned int bytes = num_dwords * sizeof(u32);
unsigned int need_wrap = 0;
unsigned int total_bytes;
u32 *cs;
/* Packets must be qword aligned. */
GEM_BUG_ON(num_dwords & 1);
total_bytes = bytes + rq->reserved_space;
GEM_BUG_ON(total_bytes > ring->effective_size);
if (unlikely(total_bytes > remain_usable)) {
const int remain_actual = ring->size - ring->emit;
if (bytes > remain_usable) {
/*
* Not enough space for the basic request. So need to
* flush out the remainder and then wait for
* base + reserved.
*/
total_bytes += remain_actual;
need_wrap = remain_actual | 1;
} else {
/*
* The base request will fit but the reserved space
* falls off the end. So we don't need an immediate
* wrap and only need to effectively wait for the
* reserved size from the start of ringbuffer.
*/
total_bytes = rq->reserved_space + remain_actual;
}
}
if (unlikely(total_bytes > ring->space)) {
int ret;
/*
* Space is reserved in the ringbuffer for finalising the
* request, as that cannot be allowed to fail. During request
* finalisation, reserved_space is set to 0 to stop the
* overallocation and the assumption is that then we never need
* to wait (which has the risk of failing with EINTR).
*
* See also i915_request_alloc() and i915_request_add().
*/
GEM_BUG_ON(!rq->reserved_space);
ret = wait_for_space(ring,
i915_request_timeline(rq),
total_bytes);
if (unlikely(ret))
return ERR_PTR(ret);
}
if (unlikely(need_wrap)) {
need_wrap &= ~1;
GEM_BUG_ON(need_wrap > ring->space);
GEM_BUG_ON(ring->emit + need_wrap > ring->size);
GEM_BUG_ON(!IS_ALIGNED(need_wrap, sizeof(u64)));
/* Fill the tail with MI_NOOP */
memset64(ring->vaddr + ring->emit, 0, need_wrap / sizeof(u64));
ring->space -= need_wrap;
ring->emit = 0;
}
GEM_BUG_ON(ring->emit > ring->size - bytes);
GEM_BUG_ON(ring->space < bytes);
cs = ring->vaddr + ring->emit;
GEM_DEBUG_EXEC(memset32(cs, POISON_INUSE, bytes / sizeof(*cs)));
ring->emit += bytes;
ring->space -= bytes;
return cs;
}
/* Align the ring tail to a cacheline boundary */
int intel_ring_cacheline_align(struct i915_request *rq)
{
int num_dwords;
void *cs;
num_dwords = (rq->ring->emit & (CACHELINE_BYTES - 1)) / sizeof(u32);
if (num_dwords == 0)
return 0;
num_dwords = CACHELINE_DWORDS - num_dwords;
GEM_BUG_ON(num_dwords & 1);
cs = intel_ring_begin(rq, num_dwords);
if (IS_ERR(cs))
return PTR_ERR(cs);
memset64(cs, (u64)MI_NOOP << 32 | MI_NOOP, num_dwords / 2);
intel_ring_advance(rq, cs);
GEM_BUG_ON(rq->ring->emit & (CACHELINE_BYTES - 1));
return 0;
}
static void gen6_bsd_submit_request(struct i915_request *request)
{
struct intel_uncore *uncore = request->engine->uncore;

View File

@ -0,0 +1,51 @@
/*
* SPDX-License-Identifier: MIT
*
* Copyright © 2019 Intel Corporation
*/
#ifndef INTEL_RING_TYPES_H
#define INTEL_RING_TYPES_H
#include <linux/atomic.h>
#include <linux/kref.h>
#include <linux/types.h>
/*
* Early gen2 devices have a cacheline of just 32 bytes, using 64 is overkill,
* but keeps the logic simple. Indeed, the whole purpose of this macro is just
* to give some inclination as to some of the magic values used in the various
* workarounds!
*/
#define CACHELINE_BYTES 64
#define CACHELINE_DWORDS (CACHELINE_BYTES / sizeof(u32))
struct i915_vma;
struct intel_ring {
struct kref ref;
struct i915_vma *vma;
void *vaddr;
/*
* As we have two types of rings, one global to the engine used
* by ringbuffer submission and those that are exclusive to a
* context used by execlists, we have to play safe and allow
* atomic updates to the pin_count. However, the actual pinning
* of the context is either done during initialisation for
* ringbuffer submission or serialised as part of the context
* pinning for execlists, and so we do not need a mutex ourselves
* to serialise intel_ring_pin/intel_ring_unpin.
*/
atomic_t pin_count;
u32 head;
u32 tail;
u32 emit;
u32 space;
u32 size;
u32 effective_size;
};
#endif /* INTEL_RING_TYPES_H */

File diff suppressed because it is too large Load Diff

View File

@ -0,0 +1,38 @@
/*
* SPDX-License-Identifier: MIT
*
* Copyright © 2019 Intel Corporation
*/
#ifndef INTEL_RPS_H
#define INTEL_RPS_H
#include "intel_rps_types.h"
struct i915_request;
void intel_rps_init_early(struct intel_rps *rps);
void intel_rps_init(struct intel_rps *rps);
void intel_rps_driver_register(struct intel_rps *rps);
void intel_rps_driver_unregister(struct intel_rps *rps);
void intel_rps_enable(struct intel_rps *rps);
void intel_rps_disable(struct intel_rps *rps);
void intel_rps_park(struct intel_rps *rps);
void intel_rps_unpark(struct intel_rps *rps);
void intel_rps_boost(struct i915_request *rq);
int intel_rps_set(struct intel_rps *rps, u8 val);
void intel_rps_mark_interactive(struct intel_rps *rps, bool interactive);
int intel_gpu_freq(struct intel_rps *rps, int val);
int intel_freq_opcode(struct intel_rps *rps, int val);
u32 intel_get_cagf(struct intel_rps *rps, u32 rpstat1);
void gen5_rps_irq_handler(struct intel_rps *rps);
void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
void gen11_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);
#endif /* INTEL_RPS_H */

View File

@ -0,0 +1,93 @@
/*
* SPDX-License-Identifier: MIT
*
* Copyright © 2019 Intel Corporation
*/
#ifndef INTEL_RPS_TYPES_H
#define INTEL_RPS_TYPES_H
#include <linux/atomic.h>
#include <linux/ktime.h>
#include <linux/mutex.h>
#include <linux/types.h>
#include <linux/workqueue.h>
struct intel_ips {
u64 last_count1;
unsigned long last_time1;
unsigned long chipset_power;
u64 last_count2;
u64 last_time2;
unsigned long gfx_power;
u8 corr;
int c, m;
};
struct intel_rps_ei {
ktime_t ktime;
u32 render_c0;
u32 media_c0;
};
struct intel_rps {
struct mutex lock; /* protects enabling and the worker */
/*
* work, interrupts_enabled and pm_iir are protected by
* dev_priv->irq_lock
*/
struct work_struct work;
bool enabled;
bool active;
u32 pm_iir;
/* PM interrupt bits that should never be masked */
u32 pm_intrmsk_mbz;
u32 pm_events;
/* Frequencies are stored in potentially platform dependent multiples.
* In other words, *_freq needs to be multiplied by X to be interesting.
* Soft limits are those which are used for the dynamic reclocking done
* by the driver (raise frequencies under heavy loads, and lower for
* lighter loads). Hard limits are those imposed by the hardware.
*
* A distinction is made for overclocking, which is never enabled by
* default, and is considered to be above the hard limit if it's
* possible at all.
*/
u8 cur_freq; /* Current frequency (cached, may not == HW) */
u8 last_freq; /* Last SWREQ frequency */
u8 min_freq_softlimit; /* Minimum frequency permitted by the driver */
u8 max_freq_softlimit; /* Max frequency permitted by the driver */
u8 max_freq; /* Maximum frequency, RP0 if not overclocking */
u8 min_freq; /* AKA RPn. Minimum frequency */
u8 boost_freq; /* Frequency to request when wait boosting */
u8 idle_freq; /* Frequency to request when we are idle */
u8 efficient_freq; /* AKA RPe. Pre-determined balanced frequency */
u8 rp1_freq; /* "less than" RP0 power/freqency */
u8 rp0_freq; /* Non-overclocked max frequency. */
u16 gpll_ref_freq; /* vlv/chv GPLL reference frequency */
int last_adj;
struct {
struct mutex mutex;
enum { LOW_POWER, BETWEEN, HIGH_POWER } mode;
unsigned int interactive;
u8 up_threshold; /* Current %busy required to uplock */
u8 down_threshold; /* Current %busy required to downclock */
} power;
atomic_t num_waiters;
atomic_t boosts;
/* manual wa residency calculations */
struct intel_rps_ei ei;
struct intel_ips ips;
};
#endif /* INTEL_RPS_TYPES_H */

View File

@ -4,13 +4,13 @@
* Copyright © 2016-2018 Intel Corporation
*/
#include "gt/intel_gt_types.h"
#include "i915_drv.h"
#include "i915_active.h"
#include "i915_syncmap.h"
#include "gt/intel_timeline.h"
#include "intel_gt.h"
#include "intel_ring.h"
#include "intel_timeline.h"
#define ptr_set_bit(ptr, bit) ((typeof(ptr))((unsigned long)(ptr) | BIT(bit)))
#define ptr_test_bit(ptr, bit) ((unsigned long)(ptr) & BIT(bit))

View File

@ -7,6 +7,7 @@
#include "i915_drv.h"
#include "intel_context.h"
#include "intel_gt.h"
#include "intel_ring.h"
#include "intel_workarounds.h"
/**
@ -1215,6 +1216,26 @@ static void icl_whitelist_build(struct intel_engine_cs *engine)
static void tgl_whitelist_build(struct intel_engine_cs *engine)
{
struct i915_wa_list *w = &engine->whitelist;
switch (engine->class) {
case RENDER_CLASS:
/*
* WaAllowPMDepthAndInvocationCountAccessFromUMD:tgl
*
* This covers 4 registers which are next to one another :
* - PS_INVOCATION_COUNT
* - PS_INVOCATION_COUNT_UDW
* - PS_DEPTH_COUNT
* - PS_DEPTH_COUNT_UDW
*/
whitelist_reg_ext(w, PS_INVOCATION_COUNT,
RING_FORCE_TO_NONPRIV_ACCESS_RD |
RING_FORCE_TO_NONPRIV_RANGE_4);
break;
default:
break;
}
}
void intel_engine_init_whitelist(struct intel_engine_cs *engine)

View File

@ -23,6 +23,7 @@
*/
#include "gem/i915_gem_context.h"
#include "gt/intel_ring.h"
#include "i915_drv.h"
#include "intel_context.h"

View File

@ -103,9 +103,6 @@ static int __live_context_size(struct intel_engine_cs *engine,
*
* TLDR; this overlaps with the execlists redzone.
*/
if (HAS_EXECLISTS(engine->i915))
vaddr += LRC_HEADER_PAGES * PAGE_SIZE;
vaddr += engine->context_size - I915_GTT_PAGE_SIZE;
memset(vaddr, POISON_INUSE, I915_GTT_PAGE_SIZE);

View File

@ -0,0 +1,350 @@
/*
* SPDX-License-Identifier: MIT
*
* Copyright © 2018 Intel Corporation
*/
#include <linux/sort.h>
#include "i915_drv.h"
#include "intel_gt_requests.h"
#include "i915_selftest.h"
struct pulse {
struct i915_active active;
struct kref kref;
};
static int pulse_active(struct i915_active *active)
{
kref_get(&container_of(active, struct pulse, active)->kref);
return 0;
}
static void pulse_free(struct kref *kref)
{
kfree(container_of(kref, struct pulse, kref));
}
static void pulse_put(struct pulse *p)
{
kref_put(&p->kref, pulse_free);
}
static void pulse_retire(struct i915_active *active)
{
pulse_put(container_of(active, struct pulse, active));
}
static struct pulse *pulse_create(void)
{
struct pulse *p;
p = kmalloc(sizeof(*p), GFP_KERNEL);
if (!p)
return p;
kref_init(&p->kref);
i915_active_init(&p->active, pulse_active, pulse_retire);
return p;
}
static void pulse_unlock_wait(struct pulse *p)
{
mutex_lock(&p->active.mutex);
mutex_unlock(&p->active.mutex);
flush_work(&p->active.work);
}
static int __live_idle_pulse(struct intel_engine_cs *engine,
int (*fn)(struct intel_engine_cs *cs))
{
struct pulse *p;
int err;
GEM_BUG_ON(!intel_engine_pm_is_awake(engine));
p = pulse_create();
if (!p)
return -ENOMEM;
err = i915_active_acquire(&p->active);
if (err)
goto out;
err = i915_active_acquire_preallocate_barrier(&p->active, engine);
if (err) {
i915_active_release(&p->active);
goto out;
}
i915_active_acquire_barrier(&p->active);
i915_active_release(&p->active);
GEM_BUG_ON(i915_active_is_idle(&p->active));
GEM_BUG_ON(llist_empty(&engine->barrier_tasks));
err = fn(engine);
if (err)
goto out;
GEM_BUG_ON(!llist_empty(&engine->barrier_tasks));
if (intel_gt_retire_requests_timeout(engine->gt, HZ / 5)) {
err = -ETIME;
goto out;
}
GEM_BUG_ON(READ_ONCE(engine->serial) != engine->wakeref_serial);
pulse_unlock_wait(p); /* synchronize with the retirement callback */
if (!i915_active_is_idle(&p->active)) {
struct drm_printer m = drm_err_printer("pulse");
pr_err("%s: heartbeat pulse did not flush idle tasks\n",
engine->name);
i915_active_print(&p->active, &m);
err = -EINVAL;
goto out;
}
out:
pulse_put(p);
return err;
}
static int live_idle_flush(void *arg)
{
struct intel_gt *gt = arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err = 0;
/* Check that we can flush the idle barriers */
for_each_engine(engine, gt, id) {
intel_engine_pm_get(engine);
err = __live_idle_pulse(engine, intel_engine_flush_barriers);
intel_engine_pm_put(engine);
if (err)
break;
}
return err;
}
static int live_idle_pulse(void *arg)
{
struct intel_gt *gt = arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err = 0;
/* Check that heartbeat pulses flush the idle barriers */
for_each_engine(engine, gt, id) {
intel_engine_pm_get(engine);
err = __live_idle_pulse(engine, intel_engine_pulse);
intel_engine_pm_put(engine);
if (err && err != -ENODEV)
break;
err = 0;
}
return err;
}
static int cmp_u32(const void *_a, const void *_b)
{
const u32 *a = _a, *b = _b;
return *a - *b;
}
static int __live_heartbeat_fast(struct intel_engine_cs *engine)
{
struct intel_context *ce;
struct i915_request *rq;
ktime_t t0, t1;
u32 times[5];
int err;
int i;
ce = intel_context_create(engine->kernel_context->gem_context,
engine);
if (IS_ERR(ce))
return PTR_ERR(ce);
intel_engine_pm_get(engine);
err = intel_engine_set_heartbeat(engine, 1);
if (err)
goto err_pm;
for (i = 0; i < ARRAY_SIZE(times); i++) {
/* Manufacture a tick */
do {
while (READ_ONCE(engine->heartbeat.systole))
flush_delayed_work(&engine->heartbeat.work);
engine->serial++; /* quick, pretend we are not idle! */
flush_delayed_work(&engine->heartbeat.work);
if (!delayed_work_pending(&engine->heartbeat.work)) {
pr_err("%s: heartbeat did not start\n",
engine->name);
err = -EINVAL;
goto err_pm;
}
rcu_read_lock();
rq = READ_ONCE(engine->heartbeat.systole);
if (rq)
rq = i915_request_get_rcu(rq);
rcu_read_unlock();
} while (!rq);
t0 = ktime_get();
while (rq == READ_ONCE(engine->heartbeat.systole))
yield(); /* work is on the local cpu! */
t1 = ktime_get();
i915_request_put(rq);
times[i] = ktime_us_delta(t1, t0);
}
sort(times, ARRAY_SIZE(times), sizeof(times[0]), cmp_u32, NULL);
pr_info("%s: Heartbeat delay: %uus [%u, %u]\n",
engine->name,
times[ARRAY_SIZE(times) / 2],
times[0],
times[ARRAY_SIZE(times) - 1]);
/* Min work delay is 2 * 2 (worst), +1 for scheduling, +1 for slack */
if (times[ARRAY_SIZE(times) / 2] > jiffies_to_usecs(6)) {
pr_err("%s: Heartbeat delay was %uus, expected less than %dus\n",
engine->name,
times[ARRAY_SIZE(times) / 2],
jiffies_to_usecs(6));
err = -EINVAL;
}
intel_engine_set_heartbeat(engine, CONFIG_DRM_I915_HEARTBEAT_INTERVAL);
err_pm:
intel_engine_pm_put(engine);
intel_context_put(ce);
return err;
}
static int live_heartbeat_fast(void *arg)
{
struct intel_gt *gt = arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err = 0;
/* Check that the heartbeat ticks at the desired rate. */
if (!CONFIG_DRM_I915_HEARTBEAT_INTERVAL)
return 0;
for_each_engine(engine, gt, id) {
err = __live_heartbeat_fast(engine);
if (err)
break;
}
return err;
}
static int __live_heartbeat_off(struct intel_engine_cs *engine)
{
int err;
intel_engine_pm_get(engine);
engine->serial++;
flush_delayed_work(&engine->heartbeat.work);
if (!delayed_work_pending(&engine->heartbeat.work)) {
pr_err("%s: heartbeat not running\n",
engine->name);
err = -EINVAL;
goto err_pm;
}
err = intel_engine_set_heartbeat(engine, 0);
if (err)
goto err_pm;
engine->serial++;
flush_delayed_work(&engine->heartbeat.work);
if (delayed_work_pending(&engine->heartbeat.work)) {
pr_err("%s: heartbeat still running\n",
engine->name);
err = -EINVAL;
goto err_beat;
}
if (READ_ONCE(engine->heartbeat.systole)) {
pr_err("%s: heartbeat still allocated\n",
engine->name);
err = -EINVAL;
goto err_beat;
}
err_beat:
intel_engine_set_heartbeat(engine, CONFIG_DRM_I915_HEARTBEAT_INTERVAL);
err_pm:
intel_engine_pm_put(engine);
return err;
}
static int live_heartbeat_off(void *arg)
{
struct intel_gt *gt = arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err = 0;
/* Check that we can turn off heartbeat and not interrupt VIP */
if (!CONFIG_DRM_I915_HEARTBEAT_INTERVAL)
return 0;
for_each_engine(engine, gt, id) {
if (!intel_engine_has_preemption(engine))
continue;
err = __live_heartbeat_off(engine);
if (err)
break;
}
return err;
}
int intel_heartbeat_live_selftests(struct drm_i915_private *i915)
{
static const struct i915_subtest tests[] = {
SUBTEST(live_idle_flush),
SUBTEST(live_idle_pulse),
SUBTEST(live_heartbeat_fast),
SUBTEST(live_heartbeat_off),
};
int saved_hangcheck;
int err;
if (intel_gt_is_wedged(&i915->gt))
return 0;
saved_hangcheck = i915_modparams.enable_hangcheck;
i915_modparams.enable_hangcheck = INT_MAX;
err = intel_gt_live_subtests(tests, &i915->gt);
i915_modparams.enable_hangcheck = saved_hangcheck;
return err;
}

View File

@ -826,6 +826,8 @@ static int __igt_reset_engines(struct intel_gt *gt,
get_task_struct(tsk);
}
yield(); /* start all threads before we begin */
intel_engine_pm_get(engine);
set_bit(I915_RESET_ENGINE + id, &gt->reset.flags);
do {
@ -1016,7 +1018,7 @@ static int igt_reset_wait(void *arg)
{
struct intel_gt *gt = arg;
struct i915_gpu_error *global = &gt->i915->gpu_error;
struct intel_engine_cs *engine = gt->i915->engine[RCS0];
struct intel_engine_cs *engine = gt->engine[RCS0];
struct i915_request *rq;
unsigned int reset_count;
struct hang h;
@ -1143,14 +1145,18 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
int (*fn)(void *),
unsigned int flags)
{
struct intel_engine_cs *engine = gt->i915->engine[RCS0];
struct intel_engine_cs *engine = gt->engine[RCS0];
struct drm_i915_gem_object *obj;
struct task_struct *tsk = NULL;
struct i915_request *rq;
struct evict_vma arg;
struct hang h;
unsigned int pin_flags;
int err;
if (!gt->ggtt->num_fences && flags & EXEC_OBJECT_NEEDS_FENCE)
return 0;
if (!engine || !intel_engine_can_store_dword(engine))
return 0;
@ -1186,10 +1192,12 @@ static int __igt_reset_evict_vma(struct intel_gt *gt,
goto out_obj;
}
err = i915_vma_pin(arg.vma, 0, 0,
i915_vma_is_ggtt(arg.vma) ?
PIN_GLOBAL | PIN_MAPPABLE :
PIN_USER);
pin_flags = i915_vma_is_ggtt(arg.vma) ? PIN_GLOBAL : PIN_USER;
if (flags & EXEC_OBJECT_NEEDS_FENCE)
pin_flags |= PIN_MAPPABLE;
err = i915_vma_pin(arg.vma, 0, 0, pin_flags);
if (err) {
i915_request_add(rq);
goto out_obj;
@ -1493,7 +1501,7 @@ static int igt_handle_error(void *arg)
{
struct intel_gt *gt = arg;
struct i915_gpu_error *global = &gt->i915->gpu_error;
struct intel_engine_cs *engine = gt->i915->engine[RCS0];
struct intel_engine_cs *engine = gt->engine[RCS0];
struct hang h;
struct i915_request *rq;
struct i915_gpu_state *error;
@ -1563,7 +1571,7 @@ static int __igt_atomic_reset_engine(struct intel_engine_cs *engine,
GEM_TRACE("i915_reset_engine(%s:%s) under %s\n",
engine->name, mode, p->name);
tasklet_disable_nosync(t);
tasklet_disable(t);
p->critical_section_begin();
err = intel_engine_reset(engine, NULL);
@ -1686,7 +1694,6 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
};
struct intel_gt *gt = &i915->gt;
intel_wakeref_t wakeref;
bool saved_hangcheck;
int err;
if (!intel_has_gpu_reset(gt))
@ -1696,12 +1703,9 @@ int intel_hangcheck_live_selftests(struct drm_i915_private *i915)
return -EIO; /* we're long past hope of a successful reset */
wakeref = intel_runtime_pm_get(gt->uncore->rpm);
saved_hangcheck = fetch_and_zero(&i915_modparams.enable_hangcheck);
drain_delayed_work(&gt->hangcheck.work); /* flush param */
err = intel_gt_live_subtests(tests, gt);
i915_modparams.enable_hangcheck = saved_hangcheck;
intel_runtime_pm_put(gt->uncore->rpm, wakeref);
return err;

View File

@ -6,6 +6,7 @@
#include "intel_pm.h" /* intel_gpu_freq() */
#include "selftest_llc.h"
#include "intel_rps.h"
static int gen6_verify_ring_freq(struct intel_llc *llc)
{
@ -25,6 +26,8 @@ static int gen6_verify_ring_freq(struct intel_llc *llc)
for (gpu_freq = consts.min_gpu_freq;
gpu_freq <= consts.max_gpu_freq;
gpu_freq++) {
struct intel_rps *rps = &llc_to_gt(llc)->rps;
unsigned int ia_freq, ring_freq, found;
u32 val;
@ -44,7 +47,7 @@ static int gen6_verify_ring_freq(struct intel_llc *llc)
if (found != ia_freq) {
pr_err("Min freq table(%d/[%d, %d]):%dMHz did not match expected CPU freq, found %d, expected %d\n",
gpu_freq, consts.min_gpu_freq, consts.max_gpu_freq,
intel_gpu_freq(i915, gpu_freq * (INTEL_GEN(i915) >= 9 ? GEN9_FREQ_SCALER : 1)),
intel_gpu_freq(rps, gpu_freq * (INTEL_GEN(i915) >= 9 ? GEN9_FREQ_SCALER : 1)),
found, ia_freq);
err = -EINVAL;
break;
@ -54,7 +57,7 @@ static int gen6_verify_ring_freq(struct intel_llc *llc)
if (found != ring_freq) {
pr_err("Min freq table(%d/[%d, %d]):%dMHz did not match expected ring freq, found %d, expected %d\n",
gpu_freq, consts.min_gpu_freq, consts.max_gpu_freq,
intel_gpu_freq(i915, gpu_freq * (INTEL_GEN(i915) >= 9 ? GEN9_FREQ_SCALER : 1)),
intel_gpu_freq(rps, gpu_freq * (INTEL_GEN(i915) >= 9 ? GEN9_FREQ_SCALER : 1)),
found, ring_freq);
err = -EINVAL;
break;

View File

@ -7,6 +7,7 @@
#include <linux/prime_numbers.h>
#include "gem/i915_gem_pm.h"
#include "gt/intel_engine_heartbeat.h"
#include "gt/intel_reset.h"
#include "i915_selftest.h"
@ -168,12 +169,7 @@ static int live_unlite_restore(struct intel_gt *gt, int prio)
}
GEM_BUG_ON(!ce[1]->ring->size);
intel_ring_reset(ce[1]->ring, ce[1]->ring->size / 2);
local_irq_disable(); /* appease lockdep */
__context_pin_acquire(ce[1]);
__execlists_update_reg_state(ce[1], engine);
__context_pin_release(ce[1]);
local_irq_enable();
rq[0] = igt_spinner_create_request(&spin, ce[0], MI_ARB_CHECK);
if (IS_ERR(rq[0])) {
@ -444,6 +440,8 @@ static int live_timeslice_preempt(void *arg)
* need to preempt the current task and replace it with another
* ready task.
*/
if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
return 0;
obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
if (IS_ERR(obj))
@ -518,6 +516,11 @@ static void wait_for_submit(struct intel_engine_cs *engine,
} while (!i915_request_is_active(rq));
}
static long timeslice_threshold(const struct intel_engine_cs *engine)
{
return 2 * msecs_to_jiffies_timeout(timeslice(engine)) + 1;
}
static int live_timeslice_queue(void *arg)
{
struct intel_gt *gt = arg;
@ -535,6 +538,8 @@ static int live_timeslice_queue(void *arg)
* ELSP[1] is already occupied, so must rely on timeslicing to
* eject ELSP[0] in favour of the queue.)
*/
if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
return 0;
obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
if (IS_ERR(obj))
@ -612,8 +617,8 @@ static int live_timeslice_queue(void *arg)
err = -EINVAL;
}
/* Timeslice every jiffie, so within 2 we should signal */
if (i915_request_wait(rq, 0, 3) < 0) {
/* Timeslice every jiffy, so within 2 we should signal */
if (i915_request_wait(rq, 0, timeslice_threshold(engine)) < 0) {
struct drm_printer p =
drm_info_printer(gt->i915->drm.dev);
@ -1165,6 +1170,325 @@ err_wedged:
goto err_client_b;
}
struct live_preempt_cancel {
struct intel_engine_cs *engine;
struct preempt_client a, b;
};
static int __cancel_active0(struct live_preempt_cancel *arg)
{
struct i915_request *rq;
struct igt_live_test t;
int err;
/* Preempt cancel of ELSP0 */
GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
if (igt_live_test_begin(&t, arg->engine->i915,
__func__, arg->engine->name))
return -EIO;
clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
rq = spinner_create_request(&arg->a.spin,
arg->a.ctx, arg->engine,
MI_ARB_CHECK);
if (IS_ERR(rq))
return PTR_ERR(rq);
i915_request_get(rq);
i915_request_add(rq);
if (!igt_wait_for_spinner(&arg->a.spin, rq)) {
err = -EIO;
goto out;
}
i915_gem_context_set_banned(arg->a.ctx);
err = intel_engine_pulse(arg->engine);
if (err)
goto out;
if (i915_request_wait(rq, 0, HZ / 5) < 0) {
err = -EIO;
goto out;
}
if (rq->fence.error != -EIO) {
pr_err("Cancelled inflight0 request did not report -EIO\n");
err = -EINVAL;
goto out;
}
out:
i915_request_put(rq);
if (igt_live_test_end(&t))
err = -EIO;
return err;
}
static int __cancel_active1(struct live_preempt_cancel *arg)
{
struct i915_request *rq[2] = {};
struct igt_live_test t;
int err;
/* Preempt cancel of ELSP1 */
GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
if (igt_live_test_begin(&t, arg->engine->i915,
__func__, arg->engine->name))
return -EIO;
clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
rq[0] = spinner_create_request(&arg->a.spin,
arg->a.ctx, arg->engine,
MI_NOOP); /* no preemption */
if (IS_ERR(rq[0]))
return PTR_ERR(rq[0]);
i915_request_get(rq[0]);
i915_request_add(rq[0]);
if (!igt_wait_for_spinner(&arg->a.spin, rq[0])) {
err = -EIO;
goto out;
}
clear_bit(CONTEXT_BANNED, &arg->b.ctx->flags);
rq[1] = spinner_create_request(&arg->b.spin,
arg->b.ctx, arg->engine,
MI_ARB_CHECK);
if (IS_ERR(rq[1])) {
err = PTR_ERR(rq[1]);
goto out;
}
i915_request_get(rq[1]);
err = i915_request_await_dma_fence(rq[1], &rq[0]->fence);
i915_request_add(rq[1]);
if (err)
goto out;
i915_gem_context_set_banned(arg->b.ctx);
err = intel_engine_pulse(arg->engine);
if (err)
goto out;
igt_spinner_end(&arg->a.spin);
if (i915_request_wait(rq[1], 0, HZ / 5) < 0) {
err = -EIO;
goto out;
}
if (rq[0]->fence.error != 0) {
pr_err("Normal inflight0 request did not complete\n");
err = -EINVAL;
goto out;
}
if (rq[1]->fence.error != -EIO) {
pr_err("Cancelled inflight1 request did not report -EIO\n");
err = -EINVAL;
goto out;
}
out:
i915_request_put(rq[1]);
i915_request_put(rq[0]);
if (igt_live_test_end(&t))
err = -EIO;
return err;
}
static int __cancel_queued(struct live_preempt_cancel *arg)
{
struct i915_request *rq[3] = {};
struct igt_live_test t;
int err;
/* Full ELSP and one in the wings */
GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
if (igt_live_test_begin(&t, arg->engine->i915,
__func__, arg->engine->name))
return -EIO;
clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
rq[0] = spinner_create_request(&arg->a.spin,
arg->a.ctx, arg->engine,
MI_ARB_CHECK);
if (IS_ERR(rq[0]))
return PTR_ERR(rq[0]);
i915_request_get(rq[0]);
i915_request_add(rq[0]);
if (!igt_wait_for_spinner(&arg->a.spin, rq[0])) {
err = -EIO;
goto out;
}
clear_bit(CONTEXT_BANNED, &arg->b.ctx->flags);
rq[1] = igt_request_alloc(arg->b.ctx, arg->engine);
if (IS_ERR(rq[1])) {
err = PTR_ERR(rq[1]);
goto out;
}
i915_request_get(rq[1]);
err = i915_request_await_dma_fence(rq[1], &rq[0]->fence);
i915_request_add(rq[1]);
if (err)
goto out;
rq[2] = spinner_create_request(&arg->b.spin,
arg->a.ctx, arg->engine,
MI_ARB_CHECK);
if (IS_ERR(rq[2])) {
err = PTR_ERR(rq[2]);
goto out;
}
i915_request_get(rq[2]);
err = i915_request_await_dma_fence(rq[2], &rq[1]->fence);
i915_request_add(rq[2]);
if (err)
goto out;
i915_gem_context_set_banned(arg->a.ctx);
err = intel_engine_pulse(arg->engine);
if (err)
goto out;
if (i915_request_wait(rq[2], 0, HZ / 5) < 0) {
err = -EIO;
goto out;
}
if (rq[0]->fence.error != -EIO) {
pr_err("Cancelled inflight0 request did not report -EIO\n");
err = -EINVAL;
goto out;
}
if (rq[1]->fence.error != 0) {
pr_err("Normal inflight1 request did not complete\n");
err = -EINVAL;
goto out;
}
if (rq[2]->fence.error != -EIO) {
pr_err("Cancelled queued request did not report -EIO\n");
err = -EINVAL;
goto out;
}
out:
i915_request_put(rq[2]);
i915_request_put(rq[1]);
i915_request_put(rq[0]);
if (igt_live_test_end(&t))
err = -EIO;
return err;
}
static int __cancel_hostile(struct live_preempt_cancel *arg)
{
struct i915_request *rq;
int err;
/* Preempt cancel non-preemptible spinner in ELSP0 */
if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
return 0;
GEM_TRACE("%s(%s)\n", __func__, arg->engine->name);
clear_bit(CONTEXT_BANNED, &arg->a.ctx->flags);
rq = spinner_create_request(&arg->a.spin,
arg->a.ctx, arg->engine,
MI_NOOP); /* preemption disabled */
if (IS_ERR(rq))
return PTR_ERR(rq);
i915_request_get(rq);
i915_request_add(rq);
if (!igt_wait_for_spinner(&arg->a.spin, rq)) {
err = -EIO;
goto out;
}
i915_gem_context_set_banned(arg->a.ctx);
err = intel_engine_pulse(arg->engine); /* force reset */
if (err)
goto out;
if (i915_request_wait(rq, 0, HZ / 5) < 0) {
err = -EIO;
goto out;
}
if (rq->fence.error != -EIO) {
pr_err("Cancelled inflight0 request did not report -EIO\n");
err = -EINVAL;
goto out;
}
out:
i915_request_put(rq);
if (igt_flush_test(arg->engine->i915))
err = -EIO;
return err;
}
static int live_preempt_cancel(void *arg)
{
struct intel_gt *gt = arg;
struct live_preempt_cancel data;
enum intel_engine_id id;
int err = -ENOMEM;
/*
* To cancel an inflight context, we need to first remove it from the
* GPU. That sounds like preemption! Plus a little bit of bookkeeping.
*/
if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
return 0;
if (preempt_client_init(gt, &data.a))
return -ENOMEM;
if (preempt_client_init(gt, &data.b))
goto err_client_a;
for_each_engine(data.engine, gt, id) {
if (!intel_engine_has_preemption(data.engine))
continue;
err = __cancel_active0(&data);
if (err)
goto err_wedged;
err = __cancel_active1(&data);
if (err)
goto err_wedged;
err = __cancel_queued(&data);
if (err)
goto err_wedged;
err = __cancel_hostile(&data);
if (err)
goto err_wedged;
}
err = 0;
err_client_b:
preempt_client_fini(&data.b);
err_client_a:
preempt_client_fini(&data.a);
return err;
err_wedged:
GEM_TRACE_DUMP();
igt_spinner_end(&data.b.spin);
igt_spinner_end(&data.a.spin);
intel_gt_set_wedged(gt);
goto err_client_b;
}
static int live_suppress_self_preempt(void *arg)
{
struct intel_gt *gt = arg;
@ -1702,6 +2026,105 @@ err_spin_hi:
return err;
}
static int live_preempt_timeout(void *arg)
{
struct intel_gt *gt = arg;
struct i915_gem_context *ctx_hi, *ctx_lo;
struct igt_spinner spin_lo;
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err = -ENOMEM;
/*
* Check that we force preemption to occur by cancelling the previous
* context if it refuses to yield the GPU.
*/
if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
return 0;
if (!HAS_LOGICAL_RING_PREEMPTION(gt->i915))
return 0;
if (!intel_has_reset_engine(gt))
return 0;
if (igt_spinner_init(&spin_lo, gt))
return -ENOMEM;
ctx_hi = kernel_context(gt->i915);
if (!ctx_hi)
goto err_spin_lo;
ctx_hi->sched.priority =
I915_USER_PRIORITY(I915_CONTEXT_MAX_USER_PRIORITY);
ctx_lo = kernel_context(gt->i915);
if (!ctx_lo)
goto err_ctx_hi;
ctx_lo->sched.priority =
I915_USER_PRIORITY(I915_CONTEXT_MIN_USER_PRIORITY);
for_each_engine(engine, gt, id) {
unsigned long saved_timeout;
struct i915_request *rq;
if (!intel_engine_has_preemption(engine))
continue;
rq = spinner_create_request(&spin_lo, ctx_lo, engine,
MI_NOOP); /* preemption disabled */
if (IS_ERR(rq)) {
err = PTR_ERR(rq);
goto err_ctx_lo;
}
i915_request_add(rq);
if (!igt_wait_for_spinner(&spin_lo, rq)) {
intel_gt_set_wedged(gt);
err = -EIO;
goto err_ctx_lo;
}
rq = igt_request_alloc(ctx_hi, engine);
if (IS_ERR(rq)) {
igt_spinner_end(&spin_lo);
err = PTR_ERR(rq);
goto err_ctx_lo;
}
/* Flush the previous CS ack before changing timeouts */
while (READ_ONCE(engine->execlists.pending[0]))
cpu_relax();
saved_timeout = engine->props.preempt_timeout_ms;
engine->props.preempt_timeout_ms = 1; /* in ms, -> 1 jiffie */
i915_request_get(rq);
i915_request_add(rq);
intel_engine_flush_submission(engine);
engine->props.preempt_timeout_ms = saved_timeout;
if (i915_request_wait(rq, 0, HZ / 10) < 0) {
intel_gt_set_wedged(gt);
i915_request_put(rq);
err = -ETIME;
goto err_ctx_lo;
}
igt_spinner_end(&spin_lo);
i915_request_put(rq);
}
err = 0;
err_ctx_lo:
kernel_context_close(ctx_lo);
err_ctx_hi:
kernel_context_close(ctx_hi);
err_spin_lo:
igt_spinner_fini(&spin_lo);
return err;
}
static int random_range(struct rnd_state *rnd, int min, int max)
{
return i915_prandom_u32_max_state(max - min, rnd) + min;
@ -1829,6 +2252,8 @@ static int smoke_crescendo(struct preempt_smoke *smoke, unsigned int flags)
get_task_struct(tsk[id]);
}
yield(); /* start all threads before we kthread_stop() */
count = 0;
for_each_engine(engine, smoke->gt, id) {
int status;
@ -2599,10 +3024,12 @@ int intel_execlists_live_selftests(struct drm_i915_private *i915)
SUBTEST(live_preempt),
SUBTEST(live_late_preempt),
SUBTEST(live_nopreempt),
SUBTEST(live_preempt_cancel),
SUBTEST(live_suppress_self_preempt),
SUBTEST(live_suppress_wait_preempt),
SUBTEST(live_chain_preempt),
SUBTEST(live_preempt_hang),
SUBTEST(live_preempt_timeout),
SUBTEST(live_preempt_smoke),
SUBTEST(live_virtual_engine),
SUBTEST(live_virtual_mask),
@ -2749,6 +3176,100 @@ static int live_lrc_layout(void *arg)
return err;
}
static int find_offset(const u32 *lri, u32 offset)
{
int i;
for (i = 0; i < PAGE_SIZE / sizeof(u32); i++)
if (lri[i] == offset)
return i;
return -1;
}
static int live_lrc_fixed(void *arg)
{
struct intel_gt *gt = arg;
struct intel_engine_cs *engine;
enum intel_engine_id id;
int err = 0;
/*
* Check the assumed register offsets match the actual locations in
* the context image.
*/
for_each_engine(engine, gt, id) {
const struct {
u32 reg;
u32 offset;
const char *name;
} tbl[] = {
{
i915_mmio_reg_offset(RING_START(engine->mmio_base)),
CTX_RING_BUFFER_START - 1,
"RING_START"
},
{
i915_mmio_reg_offset(RING_CTL(engine->mmio_base)),
CTX_RING_BUFFER_CONTROL - 1,
"RING_CTL"
},
{
i915_mmio_reg_offset(RING_HEAD(engine->mmio_base)),
CTX_RING_HEAD - 1,
"RING_HEAD"
},
{
i915_mmio_reg_offset(RING_TAIL(engine->mmio_base)),
CTX_RING_TAIL - 1,
"RING_TAIL"
},
{
i915_mmio_reg_offset(RING_MI_MODE(engine->mmio_base)),
lrc_ring_mi_mode(engine),
"RING_MI_MODE"
},
{
engine->mmio_base + 0x110,
CTX_BB_STATE - 1,
"BB_STATE"
},
{ },
}, *t;
u32 *hw;
if (!engine->default_state)
continue;
hw = i915_gem_object_pin_map(engine->default_state,
I915_MAP_WB);
if (IS_ERR(hw)) {
err = PTR_ERR(hw);
break;
}
hw += LRC_STATE_PN * PAGE_SIZE / sizeof(*hw);
for (t = tbl; t->name; t++) {
int dw = find_offset(hw, t->reg);
if (dw != t->offset) {
pr_err("%s: Offset for %s [0x%x] mismatch, found %x, expected %x\n",
engine->name,
t->name,
t->reg,
dw,
t->offset);
err = -EINVAL;
}
}
i915_gem_object_unpin_map(engine->default_state);
}
return err;
}
static int __live_lrc_state(struct i915_gem_context *fixme,
struct intel_engine_cs *engine,
struct i915_vma *scratch)
@ -3021,6 +3542,7 @@ int intel_lrc_live_selftests(struct drm_i915_private *i915)
{
static const struct i915_subtest tests[] = {
SUBTEST(live_lrc_layout),
SUBTEST(live_lrc_fixed),
SUBTEST(live_lrc_state),
SUBTEST(live_gpr_clear),
};

View File

@ -126,7 +126,7 @@ static int igt_atomic_engine_reset(void *arg)
goto out_unlock;
for_each_engine(engine, gt, id) {
tasklet_disable_nosync(&engine->execlists.tasklet);
tasklet_disable(&engine->execlists.tasklet);
intel_engine_pm_get(engine);
for (p = igt_atomic_phases; p->name; p++) {

Some files were not shown because too many files have changed in this diff Show More