Merge tag 'drm-intel-gt-next-2021-10-08' of git://anongit.freedesktop.org/drm/drm-intel into drm-next

UAPI Changes:

- Add uAPI for using PXP protected objects (usage sketch after this list)

  Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/8064

- Add PCI IDs and LMEM discovery/placement uAPI for DG1

  Mesa changes: https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/11584

- Disable engine bonding on Gen12+ except TGL, RKL and ADL-S
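
  The following hedged userspace sketch (not part of this merge; the helper
  name, the region instance 0 and the minimal error handling are assumptions
  for illustration) chains the two new DRM_IOCTL_I915_GEM_CREATE_EXT
  extensions from the items above to allocate a PXP-protected buffer placed
  in device-local memory:

  #include <stdint.h>
  #include <sys/ioctl.h>
  #include <drm/i915_drm.h>

  static uint32_t create_protected_lmem_bo(int drm_fd, uint64_t size)
  {
      struct drm_i915_gem_memory_class_instance lmem = {
          .memory_class = I915_MEMORY_CLASS_DEVICE,
          .memory_instance = 0, /* assumed: first device-local region */
      };
      struct drm_i915_gem_create_ext_protected_content prot = {
          .base = { .name = I915_GEM_CREATE_EXT_PROTECTED_CONTENT },
          .flags = 0, /* must be zero; anything else is rejected */
      };
      struct drm_i915_gem_create_ext_memory_regions regions = {
          .base = {
              .name = I915_GEM_CREATE_EXT_MEMORY_REGIONS,
              .next_extension = (uintptr_t)&prot,
          },
          .num_regions = 1,
          .regions = (uintptr_t)&lmem,
      };
      struct drm_i915_gem_create_ext create = {
          .size = size,
          .extensions = (uintptr_t)&regions,
      };

      /* Fails with -ENODEV when PXP is disabled or unsupported. */
      if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_CREATE_EXT, &create))
          return 0;
      return create.handle;
  }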

Cross-subsystem Changes:

- Merge 'tip/locking/wwmutex' branch (core kernel tip)
- "mei: pxp: export pavp client to me client bus"

Core Changes:

- Update ttm_move_memcpy for async use (Thomas)

Driver Changes:

- Enable GuC submission by default on DG1 (Matt B)
- Add PXP (Protected Xe Path) support for Gen12 integrated (Daniele,
  Sean, Anshuman)
  See "drm/i915/pxp: add PXP documentation" for details!
- Remove force_probe protection for ADL-S (Raviteja)
- Add base support for XeHP/XeHP SDV (Matt R, Stuart, Lucas)
- Handle DRI_PRIME=1 on Intel igfx + Intel dgfx hybrid graphics setup (Tvrtko)
- Use Transparent Hugepages when IOMMU is enabled (Tvrtko, Chris)
- Implement LMEM backup and restore for suspend / resume (Thomas)
- Report INSTDONE_GEOM values in error state for DG2 (Matt R)
- Add DG2-specific shadow register table (Matt R)
- Update Gen11/Gen12/XeHP shadow register tables (Matt R)
- Maintain backward-compatible nested batch behavior on TGL+ (Matt R)
- Add new LRI reg offsets for DG2 (Akeem)
- Initialize unused MOCS entries to device specific values (Ayaz)
- Track and use the correct UC MOCS index on Gen12 (Ayaz)
- Add separate MOCS table for Gen12 devices other than TGL/RKL (Ayaz)
- Simplify the locking and eliminate some RCU usage (Daniel)
- Add some flushing for the 64K GTT path (Matt A)
- Mark GPU wedging on driver unregister unrecoverable (Janusz)

- Major rework in the GuC codebase, simplify locking and add docs (Matt B)
- Add DG1 GuC/HuC firmwares (Daniele, Matt B)
- Remember to call i915_sw_fence_fini on guc_state.blocked (Matt A)
- Use "gt" forcewake domain name for error messages instead of "blitter" (Matt R)
- Drop now duplicate LMEM uAPI RFC kerneldoc section (Daniel)
- Fix early tracepoints for requests (Matt A)
- Use locked access to ctx->engines in set_priority (Daniel)
- Convert gen6/gen7/gen8 read operations to fwtable (Matt R)
- Drop gen11/gen12 specific mmio write handlers (Matt R)
- Drop gen11 specific mmio read handlers (Matt R)
- Use designated initializers for init/exit table (Kees)
- Fix syncmap memory leak (Matt B)
- Add pretty printing for buddy allocator state debug (Matt A)
- Fix potential error pointer dereference in pinned_context() (Dan)
- Remove IS_ACTIVE macro (Lucas)
- Static code checker fixes (Nathan)
- Clean up disabled warnings (Nathan)
- Increase timeout in i915_gem_contexts selftests 5x for GuC submission (Matt B)
- Ensure wa_init_finish() is called for ctx workaround list (Matt R)
- Initialize L3CC table in mocs init (Sreedhar, Ayaz, Ram)
- Get PM ref before accessing HW register (Vinay)
- Move __i915_gem_free_object to ttm_bo_destroy (Maarten)
- Deduplicate frequency dump on debugfs (Lucas)
- Make wa list per-gt (Venkata)
- Do not define dummy vma in stack (Venkata)
- Take pinning into account in __i915_gem_object_is_lmem (Matt B, Thomas)
- Do not report currently active engine when describing objects (Tvrtko)
- Fix pdfdocs build error by removing nested grid from GuC docs (Akira)
- Remove false warning from the rps worker (Tejas)
- Flush buffer pools on driver remove (Janusz)
- Fix runtime pm handling in i915_gem_shrink (Maarten)
- Rework TTM object initialization slightly (Thomas)
- Use fixed offset for PTEs location (Michal Wa)
- Verify result from CTB (de)register action and improve error messages (Michal Wa)
- Fix bug in user proto-context creation that leaked contexts (Matt B)

- Re-use Gen11 forcewake read functions on Gen12 (Matt R)
- Make shadow tables range-based (Matt R)
- Ditch the i915_gem_ww_ctx loop member (Thomas, Maarten)
- Use NULL instead of 0 where appropriate (Ville)
- Rename pci/debugfs functions to respect file prefix (Jani, Lucas)
- Drop guc_communication_enabled (Daniele)
- Selftest fixes (Thomas, Daniel, Matt A, Maarten)
- Clean up inconsistent indenting (Colin)
- Use direction definition DMA_BIDIRECTIONAL instead of
  PCI_DMA_BIDIRECTIONAL (Cai)
- Add "intel_" as prefix in set_mocs_index() (Ayaz)

From: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/YWAO80MB2eyToYoy@jlahtine-mobl.ger.corp.intel.com
Signed-off-by: Dave Airlie <airlied@redhat.com>
Committed by Dave Airlie on 2021-10-11 18:09:39 +10:00 as commit 1176d15f0f.
167 changed files with 5486 additions and 1947 deletions.


@ -471,6 +471,14 @@ Object Tiling IOCTLs
.. kernel-doc:: drivers/gpu/drm/i915/gem/i915_gem_tiling.c
:doc: buffer object tiling
Protected Objects
-----------------
.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp.c
:doc: PXP
.. kernel-doc:: drivers/gpu/drm/i915/pxp/intel_pxp_types.h
Microcontrollers
================
@ -495,6 +503,8 @@ GuC
.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_guc.c
:doc: GuC
.. kernel-doc:: drivers/gpu/drm/i915/gt/uc/intel_guc.h
GuC Firmware Layout
~~~~~~~~~~~~~~~~~~~


@ -248,7 +248,7 @@ static inline int modeset_lock(struct drm_modeset_lock *lock,
if (ctx->trylock_only) {
lockdep_assert_held(&ctx->ww_ctx);
if (!ww_mutex_trylock(&lock->mutex))
if (!ww_mutex_trylock(&lock->mutex, NULL))
return -EBUSY;
else
return 0;
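
For context on the added NULL argument: the 'tip/locking/wwmutex' merge
extends ww_mutex_trylock() to take an optional ww_acquire_ctx, so a trylock
can participate in a wait/wound transaction; passing NULL preserves the old
behaviour. A minimal hedged sketch (not from this diff; the function name is
illustrative):

#include <linux/ww_mutex.h>

/* Opportunistically take a lock inside an ongoing ww transaction. */
static bool try_lock_in_transaction(struct ww_mutex *lock,
                                    struct ww_acquire_ctx *ctx)
{
    if (!ww_mutex_trylock(lock, ctx))
        return false;
    /* Held; with a non-NULL ctx the acquisition is tracked against the
     * transaction for lockdep and wound/wait ordering. */
    return true;
}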


@ -131,6 +131,17 @@ config DRM_I915_GVT_KVMGT
Choose this option if you want to enable KVMGT support for
Intel GVT-g.
config DRM_I915_PXP
bool "Enable Intel PXP support for Intel Gen12 and newer platform"
depends on DRM_I915
depends on INTEL_MEI && INTEL_MEI_PXP
default n
help
PXP (Protected Xe Path) is an i915 component, available on GEN12 and
newer GPUs, that establishes a hardware-protected session and manages
the status and life cycle of the associated software session.
menu "drm/i915 Debugging"
depends on DRM_I915
depends on EXPERT


@ -13,13 +13,11 @@
# will most likely get a sudden build breakage... Hopefully we will fix
# new warnings before CI updates!
subdir-ccflags-y := -Wall -Wextra
subdir-ccflags-y += $(call cc-disable-warning, unused-parameter)
subdir-ccflags-y += $(call cc-disable-warning, type-limits)
subdir-ccflags-y += $(call cc-disable-warning, missing-field-initializers)
subdir-ccflags-y += -Wno-unused-parameter
subdir-ccflags-y += -Wno-type-limits
subdir-ccflags-y += -Wno-missing-field-initializers
subdir-ccflags-y += -Wno-sign-compare
subdir-ccflags-y += $(call cc-disable-warning, unused-but-set-variable)
# clang warnings
subdir-ccflags-y += $(call cc-disable-warning, sign-compare)
subdir-ccflags-y += $(call cc-disable-warning, initializer-overrides)
subdir-ccflags-y += $(call cc-disable-warning, frame-address)
subdir-ccflags-$(CONFIG_DRM_I915_WERROR) += -Werror
@ -78,9 +76,6 @@ i915-$(CONFIG_PERF_EVENTS) += i915_pmu.o
# "Graphics Technology" (aka we talk to the gpu)
gt-y += \
gt/debugfs_engines.o \
gt/debugfs_gt.o \
gt/debugfs_gt_pm.o \
gt/gen2_engine_cs.o \
gt/gen6_engine_cs.o \
gt/gen6_ppgtt.o \
@ -100,8 +95,11 @@ gt-y += \
gt/intel_gt.o \
gt/intel_gt_buffer_pool.o \
gt/intel_gt_clock_utils.o \
gt/intel_gt_debugfs.o \
gt/intel_gt_engines_debugfs.o \
gt/intel_gt_irq.o \
gt/intel_gt_pm.o \
gt/intel_gt_pm_debugfs.o \
gt/intel_gt_pm_irq.o \
gt/intel_gt_requests.o \
gt/intel_gtt.o \
@ -154,6 +152,7 @@ gem-y += \
gem/i915_gem_throttle.o \
gem/i915_gem_tiling.o \
gem/i915_gem_ttm.o \
gem/i915_gem_ttm_pm.o \
gem/i915_gem_userptr.o \
gem/i915_gem_wait.o \
gem/i915_gemfs.o
@ -280,6 +279,16 @@ i915-y += \
i915-y += i915_perf.o
# Protected execution platform (PXP) support
i915-$(CONFIG_DRM_I915_PXP) += \
pxp/intel_pxp.o \
pxp/intel_pxp_cmd.o \
pxp/intel_pxp_debugfs.o \
pxp/intel_pxp_irq.o \
pxp/intel_pxp_pm.o \
pxp/intel_pxp_session.o \
pxp/intel_pxp_tee.o
# Post-mortem debug and GPU hang state capture
i915-$(CONFIG_DRM_I915_CAPTURE_ERROR) += i915_gpu_error.o
i915-$(CONFIG_DRM_I915_SELFTEST) += \


@ -71,6 +71,8 @@
#include "gt/intel_rps.h"
#include "gt/gen8_ppgtt.h"
#include "pxp/intel_pxp.h"
#include "g4x_dp.h"
#include "g4x_hdmi.h"
#include "i915_drv.h"
@ -8987,13 +8989,28 @@ static int intel_bigjoiner_add_affected_planes(struct intel_atomic_state *state)
return 0;
}
static bool bo_has_valid_encryption(struct drm_i915_gem_object *obj)
{
struct drm_i915_private *i915 = to_i915(obj->base.dev);
return intel_pxp_key_check(&i915->gt.pxp, obj, false) == 0;
}
static bool pxp_is_borked(struct drm_i915_gem_object *obj)
{
return i915_gem_object_is_protected(obj) && !bo_has_valid_encryption(obj);
}
static int intel_atomic_check_planes(struct intel_atomic_state *state)
{
struct drm_i915_private *dev_priv = to_i915(state->base.dev);
struct intel_crtc_state *old_crtc_state, *new_crtc_state;
struct intel_plane_state *plane_state;
struct intel_plane *plane;
struct intel_plane_state *new_plane_state;
struct intel_plane_state *old_plane_state;
struct intel_crtc *crtc;
const struct drm_framebuffer *fb;
int i, ret;
ret = icl_add_linked_planes(state);
@ -9041,6 +9058,19 @@ static int intel_atomic_check_planes(struct intel_atomic_state *state)
return ret;
}
for_each_new_intel_plane_in_state(state, plane, plane_state, i) {
new_plane_state = intel_atomic_get_new_plane_state(state, plane);
old_plane_state = intel_atomic_get_old_plane_state(state, plane);
fb = new_plane_state->hw.fb;
if (fb) {
new_plane_state->decrypt = bo_has_valid_encryption(intel_fb_obj(fb));
new_plane_state->force_black = pxp_is_borked(intel_fb_obj(fb));
} else {
new_plane_state->decrypt = old_plane_state->decrypt;
new_plane_state->force_black = old_plane_state->force_black;
}
}
return 0;
}
@ -9327,6 +9357,10 @@ static int intel_atomic_check_async(struct intel_atomic_state *state)
drm_dbg_kms(&i915->drm, "Color range cannot be changed in async flip\n");
return -EINVAL;
}
/* plane decryption is allowed to change only in synchronous flips */
if (old_plane_state->decrypt != new_plane_state->decrypt)
return -EINVAL;
}
return 0;


@ -626,6 +626,12 @@ struct intel_plane_state {
struct intel_fb_view view;
/* Plane pxp decryption state */
bool decrypt;
/* Plane state to display black pixels when pxp is borked */
bool force_black;
/* plane control register */
u32 ctl;


@ -18,6 +18,7 @@
#include "intel_sprite.h"
#include "skl_scaler.h"
#include "skl_universal_plane.h"
#include "pxp/intel_pxp.h"
static const u32 skl_plane_formats[] = {
DRM_FORMAT_C8,
@ -1007,6 +1008,33 @@ static u32 skl_surf_address(const struct intel_plane_state *plane_state,
}
}
static void intel_load_plane_csc_black(struct intel_plane *intel_plane)
{
struct drm_i915_private *dev_priv = to_i915(intel_plane->base.dev);
enum pipe pipe = intel_plane->pipe;
enum plane_id plane = intel_plane->id;
u16 postoff = 0;
drm_dbg_kms(&dev_priv->drm, "plane color CTM to black %s:%d\n",
intel_plane->base.name, plane);
intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 0), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 1), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 2), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 3), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 4), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_COEFF(pipe, plane, 5), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 0), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 1), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_PREOFF(pipe, plane, 2), 0);
intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 0), postoff);
intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 1), postoff);
intel_de_write_fw(dev_priv, PLANE_CSC_POSTOFF(pipe, plane, 2), postoff);
}
static void
skl_program_plane(struct intel_plane *plane,
const struct intel_crtc_state *crtc_state,
@ -1030,7 +1058,7 @@ skl_program_plane(struct intel_plane *plane,
u8 alpha = plane_state->hw.alpha >> 8;
u32 plane_color_ctl = 0, aux_dist = 0;
unsigned long irqflags;
u32 keymsk, keymax;
u32 keymsk, keymax, plane_surf;
u32 plane_ctl = plane_state->ctl;
plane_ctl |= skl_plane_ctl_crtc(crtc_state);
@ -1118,8 +1146,23 @@ skl_program_plane(struct intel_plane *plane,
* the control register just before the surface register.
*/
intel_de_write_fw(dev_priv, PLANE_CTL(pipe, plane_id), plane_ctl);
intel_de_write_fw(dev_priv, PLANE_SURF(pipe, plane_id),
intel_plane_ggtt_offset(plane_state) + surf_addr);
plane_surf = intel_plane_ggtt_offset(plane_state) + surf_addr;
plane_color_ctl = intel_de_read_fw(dev_priv, PLANE_COLOR_CTL(pipe, plane_id));
/*
* FIXME: pxp session invalidation can hit at any time, even during or
* after the commit; display content will then be garbage.
*/
if (plane_state->decrypt) {
plane_surf |= PLANE_SURF_DECRYPT;
} else if (plane_state->force_black) {
intel_load_plane_csc_black(plane);
plane_color_ctl |= PLANE_COLOR_PLANE_CSC_ENABLE;
}
intel_de_write_fw(dev_priv, PLANE_COLOR_CTL(pipe, plane_id),
plane_color_ctl);
intel_de_write_fw(dev_priv, PLANE_SURF(pipe, plane_id), plane_surf);
spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
}


@ -77,6 +77,8 @@
#include "gt/intel_gpu_commands.h"
#include "gt/intel_ring.h"
#include "pxp/intel_pxp.h"
#include "i915_gem_context.h"
#include "i915_trace.h"
#include "i915_user_extensions.h"
@ -186,10 +188,13 @@ static int validate_priority(struct drm_i915_private *i915,
return 0;
}
static void proto_context_close(struct i915_gem_proto_context *pc)
static void proto_context_close(struct drm_i915_private *i915,
struct i915_gem_proto_context *pc)
{
int i;
if (pc->pxp_wakeref)
intel_runtime_pm_put(&i915->runtime_pm, pc->pxp_wakeref);
if (pc->vm)
i915_vm_put(pc->vm);
if (pc->user_engines) {
@ -241,6 +246,35 @@ static int proto_context_set_persistence(struct drm_i915_private *i915,
return 0;
}
static int proto_context_set_protected(struct drm_i915_private *i915,
struct i915_gem_proto_context *pc,
bool protected)
{
int ret = 0;
if (!protected) {
pc->uses_protected_content = false;
} else if (!intel_pxp_is_enabled(&i915->gt.pxp)) {
ret = -ENODEV;
} else if ((pc->user_flags & BIT(UCONTEXT_RECOVERABLE)) ||
!(pc->user_flags & BIT(UCONTEXT_BANNABLE))) {
ret = -EPERM;
} else {
pc->uses_protected_content = true;
/*
* protected context usage requires the PXP session to be up,
* which in turn requires the device to be active.
*/
pc->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
if (!intel_pxp_is_active(&i915->gt.pxp))
ret = intel_pxp_start(&i915->gt.pxp);
}
return ret;
}
static struct i915_gem_proto_context *
proto_context_create(struct drm_i915_private *i915, unsigned int flags)
{
@ -269,7 +303,7 @@ proto_context_create(struct drm_i915_private *i915, unsigned int flags)
return pc;
proto_close:
proto_context_close(pc);
proto_context_close(i915, pc);
return err;
}
@ -442,6 +476,13 @@ set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data)
u16 idx, num_bonds;
int err, n;
if (GRAPHICS_VER(i915) >= 12 && !IS_TIGERLAKE(i915) &&
!IS_ROCKETLAKE(i915) && !IS_ALDERLAKE_S(i915)) {
drm_dbg(&i915->drm,
"Bonding on gen12+ aside from TGL, RKL, and ADL_S not supported\n");
return -ENODEV;
}
if (get_user(idx, &ext->virtual_index))
return -EFAULT;
@ -686,6 +727,8 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
ret = -EPERM;
else if (args->value)
pc->user_flags |= BIT(UCONTEXT_BANNABLE);
else if (pc->uses_protected_content)
ret = -EPERM;
else
pc->user_flags &= ~BIT(UCONTEXT_BANNABLE);
break;
@ -693,10 +736,12 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
case I915_CONTEXT_PARAM_RECOVERABLE:
if (args->size)
ret = -EINVAL;
else if (args->value)
pc->user_flags |= BIT(UCONTEXT_RECOVERABLE);
else
else if (!args->value)
pc->user_flags &= ~BIT(UCONTEXT_RECOVERABLE);
else if (pc->uses_protected_content)
ret = -EPERM;
else
pc->user_flags |= BIT(UCONTEXT_RECOVERABLE);
break;
case I915_CONTEXT_PARAM_PRIORITY:
@ -724,6 +769,11 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
args->value);
break;
case I915_CONTEXT_PARAM_PROTECTED_CONTENT:
ret = proto_context_set_protected(fpriv->dev_priv, pc,
args->value);
break;
case I915_CONTEXT_PARAM_NO_ZEROMAP:
case I915_CONTEXT_PARAM_BAN_PERIOD:
case I915_CONTEXT_PARAM_RINGSIZE:
@ -735,44 +785,6 @@ static int set_proto_ctx_param(struct drm_i915_file_private *fpriv,
return ret;
}
static struct i915_address_space *
context_get_vm_rcu(struct i915_gem_context *ctx)
{
GEM_BUG_ON(!rcu_access_pointer(ctx->vm));
do {
struct i915_address_space *vm;
/*
* We do not allow downgrading from full-ppgtt [to a shared
* global gtt], so ctx->vm cannot become NULL.
*/
vm = rcu_dereference(ctx->vm);
if (!kref_get_unless_zero(&vm->ref))
continue;
/*
* This ppgtt may have be reallocated between
* the read and the kref, and reassigned to a third
* context. In order to avoid inadvertent sharing
* of this ppgtt with that third context (and not
* src), we have to confirm that we have the same
* ppgtt after passing through the strong memory
* barrier implied by a successful
* kref_get_unless_zero().
*
* Once we have acquired the current ppgtt of ctx,
* we no longer care if it is released from ctx, as
* it cannot be reallocated elsewhere.
*/
if (vm == rcu_access_pointer(ctx->vm))
return rcu_pointer_handoff(vm);
i915_vm_put(vm);
} while (1);
}
static int intel_context_set_gem(struct intel_context *ce,
struct i915_gem_context *ctx,
struct intel_sseu sseu)
@ -784,23 +796,15 @@ static int intel_context_set_gem(struct intel_context *ce,
ce->ring_size = SZ_16K;
if (rcu_access_pointer(ctx->vm)) {
struct i915_address_space *vm;
rcu_read_lock();
vm = context_get_vm_rcu(ctx); /* hmm */
rcu_read_unlock();
i915_vm_put(ce->vm);
ce->vm = vm;
}
i915_vm_put(ce->vm);
ce->vm = i915_gem_context_get_eb_vm(ctx);
if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
intel_engine_has_timeslices(ce->engine) &&
intel_engine_has_semaphores(ce->engine))
__set_bit(CONTEXT_USE_SEMAPHORES, &ce->flags);
if (IS_ACTIVE(CONFIG_DRM_I915_REQUEST_TIMEOUT) &&
if (CONFIG_DRM_I915_REQUEST_TIMEOUT &&
ctx->i915->params.request_timeout_ms) {
unsigned int timeout_ms = ctx->i915->params.request_timeout_ms;
@ -937,6 +941,10 @@ static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
unsigned int n;
e = alloc_engines(num_engines);
if (!e)
return ERR_PTR(-ENOMEM);
e->num_engines = num_engines;
for (n = 0; n < num_engines; n++) {
struct intel_context *ce;
int ret;
@ -970,7 +978,6 @@ static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
goto free_engines;
}
}
e->num_engines = num_engines;
return e;
@ -979,9 +986,11 @@ static struct i915_gem_engines *user_engines(struct i915_gem_context *ctx,
return err;
}
void i915_gem_context_release(struct kref *ref)
static void i915_gem_context_release_work(struct work_struct *work)
{
struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
struct i915_gem_context *ctx = container_of(work, typeof(*ctx),
release_work);
struct i915_address_space *vm;
trace_i915_context_free(ctx);
GEM_BUG_ON(!i915_gem_context_is_closed(ctx));
@ -989,6 +998,13 @@ void i915_gem_context_release(struct kref *ref)
if (ctx->syncobj)
drm_syncobj_put(ctx->syncobj);
vm = ctx->vm;
if (vm)
i915_vm_put(vm);
if (ctx->pxp_wakeref)
intel_runtime_pm_put(&ctx->i915->runtime_pm, ctx->pxp_wakeref);
mutex_destroy(&ctx->engines_mutex);
mutex_destroy(&ctx->lut_mutex);
@ -998,6 +1014,13 @@ void i915_gem_context_release(struct kref *ref)
kfree_rcu(ctx, rcu);
}
void i915_gem_context_release(struct kref *ref)
{
struct i915_gem_context *ctx = container_of(ref, typeof(*ctx), ref);
queue_work(ctx->i915->wq, &ctx->release_work);
}
static inline struct i915_gem_engines *
__context_engines_static(const struct i915_gem_context *ctx)
{
@ -1204,9 +1227,16 @@ static void context_close(struct i915_gem_context *ctx)
set_closed_name(ctx);
vm = i915_gem_context_vm(ctx);
if (vm)
vm = ctx->vm;
if (vm) {
/* i915_vm_close drops the final reference, which is a bit too
* early and could result in surprises with concurrent
* operations racing with this ctx close. Keep a full reference
* until the end.
*/
i915_vm_get(vm);
i915_vm_close(vm);
}
ctx->file_priv = ERR_PTR(-EBADF);
@ -1277,49 +1307,6 @@ static int __context_set_persistence(struct i915_gem_context *ctx, bool state)
return 0;
}
static inline struct i915_gem_engines *
__context_engines_await(const struct i915_gem_context *ctx,
bool *user_engines)
{
struct i915_gem_engines *engines;
rcu_read_lock();
do {
engines = rcu_dereference(ctx->engines);
GEM_BUG_ON(!engines);
if (user_engines)
*user_engines = i915_gem_context_user_engines(ctx);
/* successful await => strong mb */
if (unlikely(!i915_sw_fence_await(&engines->fence)))
continue;
if (likely(engines == rcu_access_pointer(ctx->engines)))
break;
i915_sw_fence_complete(&engines->fence);
} while (1);
rcu_read_unlock();
return engines;
}
static void
context_apply_all(struct i915_gem_context *ctx,
void (*fn)(struct intel_context *ce, void *data),
void *data)
{
struct i915_gem_engines_iter it;
struct i915_gem_engines *e;
struct intel_context *ce;
e = __context_engines_await(ctx, NULL);
for_each_gem_engine(ce, e, it)
fn(ce, data);
i915_sw_fence_complete(&e->fence);
}
static struct i915_gem_context *
i915_gem_create_context(struct drm_i915_private *i915,
const struct i915_gem_proto_context *pc)
@ -1339,6 +1326,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
ctx->sched = pc->sched;
mutex_init(&ctx->mutex);
INIT_LIST_HEAD(&ctx->link);
INIT_WORK(&ctx->release_work, i915_gem_context_release_work);
spin_lock_init(&ctx->stale.lock);
INIT_LIST_HEAD(&ctx->stale.engines);
@ -1348,7 +1336,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
} else if (HAS_FULL_PPGTT(i915)) {
struct i915_ppgtt *ppgtt;
ppgtt = i915_ppgtt_create(&i915->gt);
ppgtt = i915_ppgtt_create(&i915->gt, 0);
if (IS_ERR(ppgtt)) {
drm_dbg(&i915->drm, "PPGTT setup failed (%ld)\n",
PTR_ERR(ppgtt));
@ -1358,7 +1346,7 @@ i915_gem_create_context(struct drm_i915_private *i915,
vm = &ppgtt->vm;
}
if (vm) {
RCU_INIT_POINTER(ctx->vm, i915_vm_open(vm));
ctx->vm = i915_vm_open(vm);
/* i915_vm_open() takes a reference */
i915_vm_put(vm);
@ -1399,6 +1387,11 @@ i915_gem_create_context(struct drm_i915_private *i915,
goto err_engines;
}
if (pc->uses_protected_content) {
ctx->pxp_wakeref = intel_runtime_pm_get(&i915->runtime_pm);
ctx->uses_protected_content = true;
}
trace_i915_context_create(ctx);
return ctx;
@ -1470,7 +1463,7 @@ int i915_gem_context_open(struct drm_i915_private *i915,
}
ctx = i915_gem_create_context(i915, pc);
proto_context_close(pc);
proto_context_close(i915, pc);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto err;
@ -1497,7 +1490,7 @@ void i915_gem_context_close(struct drm_file *file)
unsigned long idx;
xa_for_each(&file_priv->proto_context_xa, idx, pc)
proto_context_close(pc);
proto_context_close(file_priv->dev_priv, pc);
xa_destroy(&file_priv->proto_context_xa);
mutex_destroy(&file_priv->proto_context_lock);
@ -1526,7 +1519,7 @@ int i915_gem_vm_create_ioctl(struct drm_device *dev, void *data,
if (args->flags)
return -EINVAL;
ppgtt = i915_ppgtt_create(&i915->gt);
ppgtt = i915_ppgtt_create(&i915->gt, 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
@ -1581,18 +1574,15 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,
int err;
u32 id;
if (!rcu_access_pointer(ctx->vm))
if (!i915_gem_context_has_full_ppgtt(ctx))
return -ENODEV;
rcu_read_lock();
vm = context_get_vm_rcu(ctx);
rcu_read_unlock();
if (!vm)
return -ENODEV;
vm = ctx->vm;
GEM_BUG_ON(!vm);
err = xa_alloc(&file_priv->vm_xa, &id, vm, xa_limit_32b, GFP_KERNEL);
if (err)
goto err_put;
return err;
i915_vm_open(vm);
@ -1600,8 +1590,6 @@ static int get_ppgtt(struct drm_i915_file_private *file_priv,
args->value = id;
args->size = 0;
err_put:
i915_vm_put(vm);
return err;
}
@ -1769,23 +1757,11 @@ set_persistence(struct i915_gem_context *ctx,
return __context_set_persistence(ctx, args->value);
}
static void __apply_priority(struct intel_context *ce, void *arg)
{
struct i915_gem_context *ctx = arg;
if (!intel_engine_has_timeslices(ce->engine))
return;
if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
intel_engine_has_semaphores(ce->engine))
intel_context_set_use_semaphores(ce);
else
intel_context_clear_use_semaphores(ce);
}
static int set_priority(struct i915_gem_context *ctx,
const struct drm_i915_gem_context_param *args)
{
struct i915_gem_engines_iter it;
struct intel_context *ce;
int err;
err = validate_priority(ctx->i915, args);
@ -1793,7 +1769,27 @@ static int set_priority(struct i915_gem_context *ctx,
return err;
ctx->sched.priority = args->value;
context_apply_all(ctx, __apply_priority, ctx);
for_each_gem_engine(ce, i915_gem_context_lock_engines(ctx), it) {
if (!intel_engine_has_timeslices(ce->engine))
continue;
if (ctx->sched.priority >= I915_PRIORITY_NORMAL &&
intel_engine_has_semaphores(ce->engine))
intel_context_set_use_semaphores(ce);
else
intel_context_clear_use_semaphores(ce);
}
i915_gem_context_unlock_engines(ctx);
return 0;
}
static int get_protected(struct i915_gem_context *ctx,
struct drm_i915_gem_context_param *args)
{
args->size = 0;
args->value = i915_gem_context_uses_protected_content(ctx);
return 0;
}
@ -1821,6 +1817,8 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
ret = -EPERM;
else if (args->value)
i915_gem_context_set_bannable(ctx);
else if (i915_gem_context_uses_protected_content(ctx))
ret = -EPERM; /* can't clear this for protected contexts */
else
i915_gem_context_clear_bannable(ctx);
break;
@ -1828,10 +1826,12 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
case I915_CONTEXT_PARAM_RECOVERABLE:
if (args->size)
ret = -EINVAL;
else if (args->value)
i915_gem_context_set_recoverable(ctx);
else
else if (!args->value)
i915_gem_context_clear_recoverable(ctx);
else if (i915_gem_context_uses_protected_content(ctx))
ret = -EPERM; /* can't set this for protected contexts */
else
i915_gem_context_set_recoverable(ctx);
break;
case I915_CONTEXT_PARAM_PRIORITY:
@ -1846,6 +1846,7 @@ static int ctx_setparam(struct drm_i915_file_private *fpriv,
ret = set_persistence(ctx, args);
break;
case I915_CONTEXT_PARAM_PROTECTED_CONTENT:
case I915_CONTEXT_PARAM_NO_ZEROMAP:
case I915_CONTEXT_PARAM_BAN_PERIOD:
case I915_CONTEXT_PARAM_RINGSIZE:
@ -1924,7 +1925,7 @@ finalize_create_context_locked(struct drm_i915_file_private *file_priv,
old = xa_erase(&file_priv->proto_context_xa, id);
GEM_BUG_ON(old != pc);
proto_context_close(pc);
proto_context_close(file_priv->dev_priv, pc);
/* One for the xarray and one for the caller */
return i915_gem_context_get(ctx);
@ -2010,7 +2011,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
goto err_pc;
}
proto_context_close(ext_data.pc);
proto_context_close(i915, ext_data.pc);
gem_context_register(ctx, ext_data.fpriv, id);
} else {
ret = proto_context_register(ext_data.fpriv, ext_data.pc, &id);
@ -2024,7 +2025,7 @@ int i915_gem_context_create_ioctl(struct drm_device *dev, void *data,
return 0;
err_pc:
proto_context_close(ext_data.pc);
proto_context_close(i915, ext_data.pc);
return ret;
}
@ -2055,7 +2056,7 @@ int i915_gem_context_destroy_ioctl(struct drm_device *dev, void *data,
GEM_WARN_ON(ctx && pc);
if (pc)
proto_context_close(pc);
proto_context_close(file_priv->dev_priv, pc);
if (ctx)
context_close(ctx);
@ -2124,6 +2125,7 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
struct drm_i915_file_private *file_priv = file->driver_priv;
struct drm_i915_gem_context_param *args = data;
struct i915_gem_context *ctx;
struct i915_address_space *vm;
int ret = 0;
ctx = i915_gem_context_lookup(file_priv, args->ctx_id);
@ -2133,12 +2135,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
switch (args->param) {
case I915_CONTEXT_PARAM_GTT_SIZE:
args->size = 0;
rcu_read_lock();
if (rcu_access_pointer(ctx->vm))
args->value = rcu_dereference(ctx->vm)->total;
else
args->value = to_i915(dev)->ggtt.vm.total;
rcu_read_unlock();
vm = i915_gem_context_get_eb_vm(ctx);
args->value = vm->total;
i915_vm_put(vm);
break;
case I915_CONTEXT_PARAM_NO_ERROR_CAPTURE:
@ -2174,6 +2174,10 @@ int i915_gem_context_getparam_ioctl(struct drm_device *dev, void *data,
args->value = i915_gem_context_is_persistent(ctx);
break;
case I915_CONTEXT_PARAM_PROTECTED_CONTENT:
ret = get_protected(ctx, args);
break;
case I915_CONTEXT_PARAM_NO_ZEROMAP:
case I915_CONTEXT_PARAM_BAN_PERIOD:
case I915_CONTEXT_PARAM_ENGINES:
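
To tie the context-param plumbing above together, here is a hedged userspace
sketch (not part of this commit; the helper name and error handling are
illustrative) of creating a protected context. Setparam extensions are
applied in chain order, and PROTECTED_CONTENT is only accepted on a
bannable, non-recoverable context, so RECOVERABLE is cleared first; the
resulting flags are then immutable for the context's lifetime:

#include <stdint.h>
#include <sys/ioctl.h>
#include <drm/i915_drm.h>

static uint32_t create_protected_context(int drm_fd)
{
    struct drm_i915_gem_context_create_ext_setparam p_protected = {
        .base = { .name = I915_CONTEXT_CREATE_EXT_SETPARAM },
        .param = {
            .param = I915_CONTEXT_PARAM_PROTECTED_CONTENT,
            .value = 1,
        },
    };
    struct drm_i915_gem_context_create_ext_setparam p_unrecoverable = {
        .base = {
            .name = I915_CONTEXT_CREATE_EXT_SETPARAM,
            .next_extension = (uintptr_t)&p_protected,
        },
        .param = {
            .param = I915_CONTEXT_PARAM_RECOVERABLE,
            .value = 0,
        },
    };
    struct drm_i915_gem_context_create_ext create = {
        .flags = I915_CONTEXT_CREATE_FLAGS_USE_EXTENSIONS,
        .extensions = (uintptr_t)&p_unrecoverable,
    };

    /* -ENODEV without PXP; -EPERM if recoverable or not bannable. */
    if (ioctl(drm_fd, DRM_IOCTL_I915_GEM_CONTEXT_CREATE_EXT, &create))
        return 0;
    return create.ctx_id;
}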


@ -108,6 +108,12 @@ i915_gem_context_clear_user_engines(struct i915_gem_context *ctx)
clear_bit(CONTEXT_USER_ENGINES, &ctx->flags);
}
static inline bool
i915_gem_context_uses_protected_content(const struct i915_gem_context *ctx)
{
return ctx->uses_protected_content;
}
/* i915_gem_context.c */
void i915_gem_init__contexts(struct drm_i915_private *i915);
@ -154,17 +160,22 @@ i915_gem_context_vm(struct i915_gem_context *ctx)
return rcu_dereference_protected(ctx->vm, lockdep_is_held(&ctx->mutex));
}
static inline bool i915_gem_context_has_full_ppgtt(struct i915_gem_context *ctx)
{
GEM_BUG_ON(!!ctx->vm != HAS_FULL_PPGTT(ctx->i915));
return !!ctx->vm;
}
static inline struct i915_address_space *
i915_gem_context_get_vm_rcu(struct i915_gem_context *ctx)
i915_gem_context_get_eb_vm(struct i915_gem_context *ctx)
{
struct i915_address_space *vm;
rcu_read_lock();
vm = rcu_dereference(ctx->vm);
vm = ctx->vm;
if (!vm)
vm = &ctx->i915->ggtt.vm;
vm = i915_vm_get(vm);
rcu_read_unlock();
return vm;
}


@ -198,6 +198,12 @@ struct i915_gem_proto_context {
/** @single_timeline: See &i915_gem_context.syncobj */
bool single_timeline;
/** @uses_protected_content: See &i915_gem_context.uses_protected_content */
bool uses_protected_content;
/** @pxp_wakeref: See &i915_gem_context.pxp_wakeref */
intel_wakeref_t pxp_wakeref;
};
/**
@ -262,7 +268,7 @@ struct i915_gem_context {
* In other modes, this is a NULL pointer with the expectation that
* the caller uses the shared global GTT.
*/
struct i915_address_space __rcu *vm;
struct i915_address_space *vm;
/**
* @pid: process id of creator
@ -288,6 +294,18 @@ struct i915_gem_context {
*/
struct kref ref;
/**
* @release_work:
*
* Work item for deferred cleanup, since i915_gem_context_put() tends to
* be called from hardirq context.
*
* FIXME: The only real reason for this is &i915_gem_engines.fence, all
* other callers are from process context and need at most some mild
* shuffling to pull the i915_gem_context_put() call out of a spinlock.
*/
struct work_struct release_work;
/**
* @rcu: rcu_head for deferred freeing.
*/
@ -309,6 +327,28 @@ struct i915_gem_context {
#define CONTEXT_CLOSED 0
#define CONTEXT_USER_ENGINES 1
/**
* @uses_protected_content: context uses PXP-encrypted objects.
*
* This flag can only be set at ctx creation time and it's immutable for
* the lifetime of the context. See I915_CONTEXT_PARAM_PROTECTED_CONTENT
* in uapi/drm/i915_drm.h for more info on setting restrictions and
* expected behaviour of marked contexts.
*/
bool uses_protected_content;
/**
* @pxp_wakeref: wakeref to keep the device awake when PXP is in use
*
* PXP sessions are invalidated when the device is suspended, which in
* turn invalidates all contexts and objects using it. To keep the
* flow simple, we keep the device awake when contexts using PXP objects
* are in use. It is expected that the userspace application only uses
* PXP when the display is on, so taking a wakeref here shouldn't worsen
* our power metrics.
*/
intel_wakeref_t pxp_wakeref;
/** @mutex: guards everything that isn't engines or handles_vma */
struct mutex mutex;


@ -6,6 +6,7 @@
#include "gem/i915_gem_ioctls.h"
#include "gem/i915_gem_lmem.h"
#include "gem/i915_gem_region.h"
#include "pxp/intel_pxp.h"
#include "i915_drv.h"
#include "i915_trace.h"
@ -82,21 +83,11 @@ static int i915_gem_publish(struct drm_i915_gem_object *obj,
return 0;
}
/**
* Creates a new object using the same path as DRM_I915_GEM_CREATE_EXT
* @i915: i915 private
* @size: size of the buffer, in bytes
* @placements: possible placement regions, in priority order
* @n_placements: number of possible placement regions
*
* This function is exposed primarily for selftests and does very little
* error checking. It is assumed that the set of placement regions has
* already been verified to be valid.
*/
struct drm_i915_gem_object *
__i915_gem_object_create_user(struct drm_i915_private *i915, u64 size,
struct intel_memory_region **placements,
unsigned int n_placements)
static struct drm_i915_gem_object *
__i915_gem_object_create_user_ext(struct drm_i915_private *i915, u64 size,
struct intel_memory_region **placements,
unsigned int n_placements,
unsigned int ext_flags)
{
struct intel_memory_region *mr = placements[0];
struct drm_i915_gem_object *obj;
@ -135,6 +126,9 @@ __i915_gem_object_create_user(struct drm_i915_private *i915, u64 size,
GEM_BUG_ON(size != obj->base.size);
/* Add any flag set by create_ext options */
obj->flags |= ext_flags;
trace_i915_gem_object_create(obj);
return obj;
@ -145,6 +139,26 @@ __i915_gem_object_create_user(struct drm_i915_private *i915, u64 size,
return ERR_PTR(ret);
}
/**
* Creates a new object using the same path as DRM_I915_GEM_CREATE_EXT
* @i915: i915 private
* @size: size of the buffer, in bytes
* @placements: possible placement regions, in priority order
* @n_placements: number of possible placement regions
*
* This function is exposed primarily for selftests and does very little
* error checking. It is assumed that the set of placement regions has
* already been verified to be valid.
*/
struct drm_i915_gem_object *
__i915_gem_object_create_user(struct drm_i915_private *i915, u64 size,
struct intel_memory_region **placements,
unsigned int n_placements)
{
return __i915_gem_object_create_user_ext(i915, size, placements,
n_placements, 0);
}
int
i915_gem_dumb_create(struct drm_file *file,
struct drm_device *dev,
@ -224,6 +238,7 @@ struct create_ext {
struct drm_i915_private *i915;
struct intel_memory_region *placements[INTEL_REGION_UNKNOWN];
unsigned int n_placements;
unsigned long flags;
};
static void repr_placements(char *buf, size_t size,
@ -347,17 +362,34 @@ static int ext_set_placements(struct i915_user_extension __user *base,
{
struct drm_i915_gem_create_ext_memory_regions ext;
if (!IS_ENABLED(CONFIG_DRM_I915_UNSTABLE_FAKE_LMEM))
return -ENODEV;
if (copy_from_user(&ext, base, sizeof(ext)))
return -EFAULT;
return set_placements(&ext, data);
}
static int ext_set_protected(struct i915_user_extension __user *base, void *data)
{
struct drm_i915_gem_create_ext_protected_content ext;
struct create_ext *ext_data = data;
if (copy_from_user(&ext, base, sizeof(ext)))
return -EFAULT;
if (ext.flags)
return -EINVAL;
if (!intel_pxp_is_enabled(&ext_data->i915->gt.pxp))
return -ENODEV;
ext_data->flags |= I915_BO_PROTECTED;
return 0;
}
static const i915_user_extension_fn create_extensions[] = {
[I915_GEM_CREATE_EXT_MEMORY_REGIONS] = ext_set_placements,
[I915_GEM_CREATE_EXT_PROTECTED_CONTENT] = ext_set_protected,
};
/**
@ -392,9 +424,10 @@ i915_gem_create_ext_ioctl(struct drm_device *dev, void *data,
ext_data.n_placements = 1;
}
obj = __i915_gem_object_create_user(i915, args->size,
ext_data.placements,
ext_data.n_placements);
obj = __i915_gem_object_create_user_ext(i915, args->size,
ext_data.placements,
ext_data.n_placements,
ext_data.flags);
if (IS_ERR(obj))
return PTR_ERR(obj);


@ -21,6 +21,8 @@
#include "gt/intel_gt_pm.h"
#include "gt/intel_ring.h"
#include "pxp/intel_pxp.h"
#include "i915_drv.h"
#include "i915_gem_clflush.h"
#include "i915_gem_context.h"
@ -733,7 +735,7 @@ static int eb_select_context(struct i915_execbuffer *eb)
return PTR_ERR(ctx);
eb->gem_context = ctx;
if (rcu_access_pointer(ctx->vm))
if (i915_gem_context_has_full_ppgtt(ctx))
eb->invalid_flags |= EXEC_OBJECT_NEEDS_GTT;
return 0;
@ -759,11 +761,7 @@ static int __eb_add_lut(struct i915_execbuffer *eb,
/* Check that the context hasn't been closed in the meantime */
err = -EINTR;
if (!mutex_lock_interruptible(&ctx->lut_mutex)) {
struct i915_address_space *vm = rcu_access_pointer(ctx->vm);
if (unlikely(vm && vma->vm != vm))
err = -EAGAIN; /* user racing with ctx set-vm */
else if (likely(!i915_gem_context_is_closed(ctx)))
if (likely(!i915_gem_context_is_closed(ctx)))
err = radix_tree_insert(&ctx->handles_vma, handle, vma);
else
err = -ENOENT;
@ -814,6 +812,22 @@ static struct i915_vma *eb_lookup_vma(struct i915_execbuffer *eb, u32 handle)
if (unlikely(!obj))
return ERR_PTR(-ENOENT);
/*
* If the user has opted-in for protected-object tracking, make
* sure the object encryption can be used.
* We only need to do this when the object is first used with
* this context, because the context itself will be banned when
* the protected objects become invalid.
*/
if (i915_gem_context_uses_protected_content(eb->gem_context) &&
i915_gem_object_is_protected(obj)) {
err = intel_pxp_key_check(&vm->gt->pxp, obj, true);
if (err) {
i915_gem_object_put(obj);
return ERR_PTR(err);
}
}
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma)) {
i915_gem_object_put(obj);


@ -56,8 +56,8 @@ bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
* @obj: The object to check.
*
* This function is intended to be called from within the fence signaling
* path where the fence keeps the object from being migrated. For example
* during gpu reset or similar.
* path where the fence, or a pin, keeps the object from being migrated,
* for example during gpu reset or similar.
*
* Return: Whether the object is resident in lmem.
*/
@ -66,7 +66,8 @@ bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj)
struct intel_memory_region *mr = READ_ONCE(obj->mm.region);
#ifdef CONFIG_LOCKDEP
GEM_WARN_ON(dma_resv_test_signaled(obj->base.resv, true));
GEM_WARN_ON(dma_resv_test_signaled(obj->base.resv, true) &&
i915_gem_object_evictable(obj));
#endif
return mr && (mr->type == INTEL_MEMORY_LOCAL ||
mr->type == INTEL_MEMORY_STOLEN_LOCAL);
@ -103,6 +104,32 @@ __i915_gem_object_create_lmem_with_ps(struct drm_i915_private *i915,
size, page_size, flags);
}
struct drm_i915_gem_object *
i915_gem_object_create_lmem_from_data(struct drm_i915_private *i915,
const void *data, size_t size)
{
struct drm_i915_gem_object *obj;
void *map;
obj = i915_gem_object_create_lmem(i915,
round_up(size, PAGE_SIZE),
I915_BO_ALLOC_CONTIGUOUS);
if (IS_ERR(obj))
return obj;
map = i915_gem_object_pin_map_unlocked(obj, I915_MAP_WC);
if (IS_ERR(map)) {
i915_gem_object_put(obj);
return map;
}
memcpy(map, data, size);
i915_gem_object_unpin_map(obj);
return obj;
}
struct drm_i915_gem_object *
i915_gem_object_create_lmem(struct drm_i915_private *i915,
resource_size_t size,


@ -23,6 +23,10 @@ bool i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
bool __i915_gem_object_is_lmem(struct drm_i915_gem_object *obj);
struct drm_i915_gem_object *
i915_gem_object_create_lmem_from_data(struct drm_i915_private *i915,
const void *data, size_t size);
struct drm_i915_gem_object *
__i915_gem_object_create_lmem_with_ps(struct drm_i915_private *i915,
resource_size_t size,


@ -395,7 +395,7 @@ static vm_fault_t vm_fault_gtt(struct vm_fault *vmf)
/* Track the mmo associated with the fenced vma */
vma->mmo = mmo;
if (IS_ACTIVE(CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND))
if (CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND)
intel_wakeref_auto(&i915->ggtt.userfault_wakeref,
msecs_to_jiffies_timeout(CONFIG_DRM_I915_USERFAULT_AUTOSUSPEND));


@ -25,6 +25,7 @@
#include <linux/sched/mm.h>
#include "display/intel_frontbuffer.h"
#include "pxp/intel_pxp.h"
#include "i915_drv.h"
#include "i915_gem_clflush.h"
#include "i915_gem_context.h"
@ -89,6 +90,22 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
mutex_init(&obj->mm.get_dma_page.lock);
}
/**
* __i915_gem_object_fini - Clean up a GEM object initialization
* @obj: The gem object to clean up
*
* This function cleans up gem object fields that are set up by
* drm_gem_private_object_init() and i915_gem_object_init().
* It's primarily intended as a helper for backends that need to
* clean up the gem object in separate steps.
*/
void __i915_gem_object_fini(struct drm_i915_gem_object *obj)
{
mutex_destroy(&obj->mm.get_page.lock);
mutex_destroy(&obj->mm.get_dma_page.lock);
dma_resv_fini(&obj->base._resv);
}
/**
* Mark up the object's coherency levels for a given cache_level
* @obj: #drm_i915_gem_object
@ -174,7 +191,6 @@ void __i915_gem_free_object_rcu(struct rcu_head *head)
container_of(head, typeof(*obj), rcu);
struct drm_i915_private *i915 = to_i915(obj->base.dev);
dma_resv_fini(&obj->base._resv);
i915_gem_object_free(obj);
GEM_BUG_ON(!atomic_read(&i915->mm.free_count));
@ -204,10 +220,17 @@ static void __i915_gem_object_free_mmaps(struct drm_i915_gem_object *obj)
}
}
void __i915_gem_free_object(struct drm_i915_gem_object *obj)
/**
* __i915_gem_object_pages_fini - Clean up page usage of a gem object
* @obj: The gem object to clean up
*
* This function cleans up usage of the object mm.pages member. It
* is intended for backends that need to clean up a gem object in
* separate steps and needs to be called when the object is idle before
* the object's backing memory is freed.
*/
void __i915_gem_object_pages_fini(struct drm_i915_gem_object *obj)
{
trace_i915_gem_object_destroy(obj);
if (!list_empty(&obj->vma.list)) {
struct i915_vma *vma;
@ -233,11 +256,17 @@ void __i915_gem_free_object(struct drm_i915_gem_object *obj)
__i915_gem_object_free_mmaps(obj);
GEM_BUG_ON(!list_empty(&obj->lut_list));
atomic_set(&obj->mm.pages_pin_count, 0);
__i915_gem_object_put_pages(obj);
GEM_BUG_ON(i915_gem_object_has_pages(obj));
}
void __i915_gem_free_object(struct drm_i915_gem_object *obj)
{
trace_i915_gem_object_destroy(obj);
GEM_BUG_ON(!list_empty(&obj->lut_list));
bitmap_free(obj->bit_17);
if (obj->base.import_attach)
@ -253,6 +282,8 @@ void __i915_gem_free_object(struct drm_i915_gem_object *obj)
if (obj->shares_resv_from)
i915_vm_resv_put(obj->shares_resv_from);
__i915_gem_object_fini(obj);
}
static void __i915_gem_free_objects(struct drm_i915_private *i915,
@ -266,6 +297,7 @@ static void __i915_gem_free_objects(struct drm_i915_private *i915,
obj->ops->delayed_free(obj);
continue;
}
__i915_gem_object_pages_fini(obj);
__i915_gem_free_object(obj);
/* But keep the pointer alive for RCU-protected lookups */


@ -58,6 +58,9 @@ void i915_gem_object_init(struct drm_i915_gem_object *obj,
const struct drm_i915_gem_object_ops *ops,
struct lock_class_key *key,
unsigned alloc_flags);
void __i915_gem_object_fini(struct drm_i915_gem_object *obj);
struct drm_i915_gem_object *
i915_gem_object_create_shmem(struct drm_i915_private *i915,
resource_size_t size);
@ -269,6 +272,12 @@ i915_gem_object_clear_tiling_quirk(struct drm_i915_gem_object *obj)
clear_bit(I915_TILING_QUIRK_BIT, &obj->flags);
}
static inline bool
i915_gem_object_is_protected(const struct drm_i915_gem_object *obj)
{
return obj->flags & I915_BO_PROTECTED;
}
static inline bool
i915_gem_object_type_has(const struct drm_i915_gem_object *obj,
unsigned long flags)
@ -503,23 +512,6 @@ i915_gem_object_finish_access(struct drm_i915_gem_object *obj)
i915_gem_object_unpin_pages(obj);
}
static inline struct intel_engine_cs *
i915_gem_object_last_write_engine(struct drm_i915_gem_object *obj)
{
struct intel_engine_cs *engine = NULL;
struct dma_fence *fence;
rcu_read_lock();
fence = dma_resv_get_excl_unlocked(obj->base.resv);
rcu_read_unlock();
if (fence && dma_fence_is_i915(fence) && !dma_fence_is_signaled(fence))
engine = to_request(fence)->engine;
dma_fence_put(fence);
return engine;
}
void i915_gem_object_set_cache_coherency(struct drm_i915_gem_object *obj,
unsigned int cache_level);
void i915_gem_object_flush_if_display(struct drm_i915_gem_object *obj);
@ -599,6 +591,8 @@ bool i915_gem_object_is_shmem(const struct drm_i915_gem_object *obj);
void __i915_gem_free_object_rcu(struct rcu_head *head);
void __i915_gem_object_pages_fini(struct drm_i915_gem_object *obj);
void __i915_gem_free_object(struct drm_i915_gem_object *obj);
bool i915_gem_object_evictable(struct drm_i915_gem_object *obj);


@ -288,17 +288,23 @@ struct drm_i915_gem_object {
I915_SELFTEST_DECLARE(struct list_head st_link);
unsigned long flags;
#define I915_BO_ALLOC_CONTIGUOUS BIT(0)
#define I915_BO_ALLOC_VOLATILE BIT(1)
#define I915_BO_ALLOC_CPU_CLEAR BIT(2)
#define I915_BO_ALLOC_USER BIT(3)
#define I915_BO_ALLOC_CONTIGUOUS BIT(0)
#define I915_BO_ALLOC_VOLATILE BIT(1)
#define I915_BO_ALLOC_CPU_CLEAR BIT(2)
#define I915_BO_ALLOC_USER BIT(3)
/* Object is allowed to lose its contents on suspend / resume, even if pinned */
#define I915_BO_ALLOC_PM_VOLATILE BIT(4)
/* Object needs to be restored early using memcpy during resume */
#define I915_BO_ALLOC_PM_EARLY BIT(5)
#define I915_BO_ALLOC_FLAGS (I915_BO_ALLOC_CONTIGUOUS | \
I915_BO_ALLOC_VOLATILE | \
I915_BO_ALLOC_CPU_CLEAR | \
I915_BO_ALLOC_USER)
#define I915_BO_READONLY BIT(4)
#define I915_TILING_QUIRK_BIT 5 /* unknown swizzling; do not release! */
I915_BO_ALLOC_USER | \
I915_BO_ALLOC_PM_VOLATILE | \
I915_BO_ALLOC_PM_EARLY)
#define I915_BO_READONLY BIT(6)
#define I915_TILING_QUIRK_BIT 7 /* unknown swizzling; do not release! */
#define I915_BO_PROTECTED BIT(8)
/**
* @mem_flags - Mutable placement-related flags
*
@ -534,9 +540,17 @@ struct drm_i915_gem_object {
struct {
struct sg_table *cached_io_st;
struct i915_gem_object_page_iter get_io_page;
struct drm_i915_gem_object *backup;
bool created:1;
} ttm;
/*
* Record which PXP key instance this object was created against (if
* any), so we can use it to determine if the encryption is valid by
* comparing against the current key instance.
*/
u32 pxp_key_instance;
/** Record of address bit 17 of each page at last unbind. */
unsigned long *bit_17;


@ -5,6 +5,7 @@
*/
#include "gem/i915_gem_pm.h"
#include "gem/i915_gem_ttm_pm.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_pm.h"
#include "gt/intel_gt_requests.h"
@ -39,6 +40,88 @@ void i915_gem_suspend(struct drm_i915_private *i915)
i915_gem_drain_freed_objects(i915);
}
static int lmem_restore(struct drm_i915_private *i915, u32 flags)
{
struct intel_memory_region *mr;
int ret = 0, id;
for_each_memory_region(mr, i915, id) {
if (mr->type == INTEL_MEMORY_LOCAL) {
ret = i915_ttm_restore_region(mr, flags);
if (ret)
break;
}
}
return ret;
}
static int lmem_suspend(struct drm_i915_private *i915, u32 flags)
{
struct intel_memory_region *mr;
int ret = 0, id;
for_each_memory_region(mr, i915, id) {
if (mr->type == INTEL_MEMORY_LOCAL) {
ret = i915_ttm_backup_region(mr, flags);
if (ret)
break;
}
}
return ret;
}
static void lmem_recover(struct drm_i915_private *i915)
{
struct intel_memory_region *mr;
int id;
for_each_memory_region(mr, i915, id)
if (mr->type == INTEL_MEMORY_LOCAL)
i915_ttm_recover_region(mr);
}
int i915_gem_backup_suspend(struct drm_i915_private *i915)
{
int ret;
/* Opportunistically try to evict unpinned objects */
ret = lmem_suspend(i915, I915_TTM_BACKUP_ALLOW_GPU);
if (ret)
goto out_recover;
i915_gem_suspend(i915);
/*
* More objects may have become unpinned as requests were
* retired. Now try to evict again. The gt may be wedged here,
* in which case we automatically fall back to memcpy.
* We also allow backing up pinned objects that have not been
* marked for early recovery, and that may contain, for example,
* page tables for the migrate context.
*/
ret = lmem_suspend(i915, I915_TTM_BACKUP_ALLOW_GPU |
I915_TTM_BACKUP_PINNED);
if (ret)
goto out_recover;
/*
* Remaining objects are backed up using memcpy once we've stopped
* using the migrate context.
*/
ret = lmem_suspend(i915, I915_TTM_BACKUP_PINNED);
if (ret)
goto out_recover;
return 0;
out_recover:
lmem_recover(i915);
return ret;
}
void i915_gem_suspend_late(struct drm_i915_private *i915)
{
struct drm_i915_gem_object *obj;
@ -128,12 +211,20 @@ int i915_gem_freeze_late(struct drm_i915_private *i915)
void i915_gem_resume(struct drm_i915_private *i915)
{
int ret;
GEM_TRACE("%s\n", dev_name(i915->drm.dev));
ret = lmem_restore(i915, 0);
GEM_WARN_ON(ret);
/*
* As we didn't flush the kernel context before suspend, we cannot
* guarantee that the context image is complete. So let's just reset
* it and start again.
*/
intel_gt_resume(&i915->gt);
ret = lmem_restore(i915, I915_TTM_BACKUP_ALLOW_GPU);
GEM_WARN_ON(ret);
}


@ -18,6 +18,7 @@ void i915_gem_idle_work_handler(struct work_struct *work);
void i915_gem_suspend(struct drm_i915_private *i915);
void i915_gem_suspend_late(struct drm_i915_private *i915);
int i915_gem_backup_suspend(struct drm_i915_private *i915);
int i915_gem_freeze(struct drm_i915_private *i915);
int i915_gem_freeze_late(struct drm_i915_private *i915);


@ -80,3 +80,73 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
i915_gem_object_free(obj);
return ERR_PTR(err);
}
/**
* i915_gem_process_region - Iterate over all objects of a region using ops
* to process and optionally skip objects
* @mr: The memory region
* @apply: ops and private data
*
* This function can be used to iterate over the region's object list,
* checking whether to skip objects, and, if not, lock the objects and
* process them using the supplied ops. Note that this function temporarily
* removes objects from the region list while iterating, so if run
* concurrently with itself it may not iterate over all objects.
*
* Return: 0 if successful, negative error code on failure.
*/
int i915_gem_process_region(struct intel_memory_region *mr,
struct i915_gem_apply_to_region *apply)
{
const struct i915_gem_apply_to_region_ops *ops = apply->ops;
struct drm_i915_gem_object *obj;
struct list_head still_in_list;
int ret = 0;
/*
* In the future, a non-NULL apply->ww could mean the caller is
* already in a locking transaction and provides its own context.
*/
GEM_WARN_ON(apply->ww);
INIT_LIST_HEAD(&still_in_list);
mutex_lock(&mr->objects.lock);
for (;;) {
struct i915_gem_ww_ctx ww;
obj = list_first_entry_or_null(&mr->objects.list, typeof(*obj),
mm.region_link);
if (!obj)
break;
list_move_tail(&obj->mm.region_link, &still_in_list);
if (!kref_get_unless_zero(&obj->base.refcount))
continue;
/*
* Note: Someone else might be migrating the object at this
* point. The object's region is not stable until we lock
* the object.
*/
mutex_unlock(&mr->objects.lock);
apply->ww = &ww;
for_i915_gem_ww(&ww, ret, apply->interruptible) {
ret = i915_gem_object_lock(obj, apply->ww);
if (ret)
continue;
if (obj->mm.region == mr)
ret = ops->process_obj(apply, obj);
/* Implicit object unlock */
}
i915_gem_object_put(obj);
mutex_lock(&mr->objects.lock);
if (ret)
break;
}
list_splice_tail(&still_in_list, &mr->objects.list);
mutex_unlock(&mr->objects.lock);
return ret;
}


@ -12,6 +12,41 @@ struct intel_memory_region;
struct drm_i915_gem_object;
struct sg_table;
struct i915_gem_apply_to_region;
/**
* struct i915_gem_apply_to_region_ops - ops to use when iterating over all
* region objects.
*/
struct i915_gem_apply_to_region_ops {
/**
* process_obj - Process the current object
* @apply: Embed this for private data.
* @obj: The current object.
*
* Note that if this function is part of a ww transaction, and
* it returns -EDEADLK for one of the objects, it may be
* rerun for that same object in the same pass.
*/
int (*process_obj)(struct i915_gem_apply_to_region *apply,
struct drm_i915_gem_object *obj);
};
/**
* struct i915_gem_apply_to_region - Argument to the struct
* i915_gem_apply_to_region_ops functions.
* @ops: The ops for the operation.
* @ww: Locking context used for the transaction.
* @interruptible: Whether to perform object locking interruptible.
*
* This structure is intended to be embedded in a private struct if needed.
*/
struct i915_gem_apply_to_region {
const struct i915_gem_apply_to_region_ops *ops;
struct i915_gem_ww_ctx *ww;
u32 interruptible:1;
};
void i915_gem_object_init_memory_region(struct drm_i915_gem_object *obj,
struct intel_memory_region *mem);
void i915_gem_object_release_memory_region(struct drm_i915_gem_object *obj);
@ -22,4 +57,6 @@ i915_gem_object_create_region(struct intel_memory_region *mem,
resource_size_t page_size,
unsigned int flags);
int i915_gem_process_region(struct intel_memory_region *mr,
struct i915_gem_apply_to_region *apply);
#endif
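
A hedged in-kernel sketch of a minimal i915_gem_process_region() user (not
part of this commit; the counting helper is purely illustrative) showing how
the apply structure is meant to be embedded; process_obj runs with the
object locked inside the ww transaction:

#include "gem/i915_gem_object.h"
#include "gem/i915_gem_region.h"

struct count_apply {
    struct i915_gem_apply_to_region base;
    unsigned long count;
};

static int count_one(struct i915_gem_apply_to_region *apply,
                     struct drm_i915_gem_object *obj)
{
    struct count_apply *ca = container_of(apply, struct count_apply, base);

    ca->count++; /* obj is locked here, so obj->mm.region is stable */
    return 0;
}

static const struct i915_gem_apply_to_region_ops count_ops = {
    .process_obj = count_one,
};

static unsigned long count_region_objects(struct intel_memory_region *mr)
{
    struct count_apply ca = {
        .base = { .ops = &count_ops, .interruptible = true },
    };

    return i915_gem_process_region(mr, &ca.base) ? 0 : ca.count;
}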


@ -118,7 +118,7 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
intel_wakeref_t wakeref = 0;
unsigned long count = 0;
unsigned long scanned = 0;
int err;
int err = 0;
/* CHV + VTD workaround use stop_machine(); need to trylock vm->mutex */
bool trylock_vm = !ww && intel_vm_no_concurrent_access_wa(i915);
@ -242,12 +242,15 @@ i915_gem_shrink(struct i915_gem_ww_ctx *ww,
list_splice_tail(&still_in_list, phase->list);
spin_unlock_irqrestore(&i915->mm.obj_lock, flags);
if (err)
return err;
break;
}
if (shrink & I915_SHRINK_BOUND)
intel_runtime_pm_put(&i915->runtime_pm, wakeref);
if (err)
return err;
if (nr_scanned)
*nr_scanned += scanned;
return count;


@ -10,18 +10,16 @@
#include "intel_memory_region.h"
#include "intel_region_ttm.h"
#include "gem/i915_gem_mman.h"
#include "gem/i915_gem_object.h"
#include "gem/i915_gem_region.h"
#include "gem/i915_gem_ttm.h"
#include "gem/i915_gem_mman.h"
#include "gem/i915_gem_ttm_pm.h"
#include "gt/intel_migrate.h"
#include "gt/intel_engine_pm.h"
#define I915_PL_LMEM0 TTM_PL_PRIV
#define I915_PL_SYSTEM TTM_PL_SYSTEM
#define I915_PL_STOLEN TTM_PL_VRAM
#define I915_PL_GGTT TTM_PL_TT
#include "gt/intel_gt.h"
#include "gt/intel_migrate.h"
#define I915_TTM_PRIO_PURGE 0
#define I915_TTM_PRIO_NO_PAGES 1
@ -64,6 +62,20 @@ static struct ttm_placement i915_sys_placement = {
.busy_placement = &sys_placement_flags,
};
/**
* i915_ttm_sys_placement - Return the struct ttm_placement to be
* used for an object in system memory.
*
* Rather than making the struct extern, use this
* function.
*
* Return: A pointer to a static variable for sys placement.
*/
struct ttm_placement *i915_ttm_sys_placement(void)
{
return &i915_sys_placement;
}
static int i915_ttm_err_to_gem(int err)
{
/* Fastpath */
@ -356,9 +368,8 @@ static void i915_ttm_delete_mem_notify(struct ttm_buffer_object *bo)
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
if (likely(obj)) {
/* This releases all gem object bindings to the backend. */
__i915_gem_object_pages_fini(obj);
i915_ttm_free_cached_io_st(obj);
__i915_gem_free_object(obj);
}
}
@ -429,7 +440,9 @@ i915_ttm_resource_get_st(struct drm_i915_gem_object *obj,
}
static int i915_ttm_accel_move(struct ttm_buffer_object *bo,
bool clear,
struct ttm_resource *dst_mem,
struct ttm_tt *dst_ttm,
struct sg_table *dst_st)
{
struct drm_i915_private *i915 = container_of(bo->bdev, typeof(*i915),
@ -439,21 +452,18 @@ static int i915_ttm_accel_move(struct ttm_buffer_object *bo,
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
struct sg_table *src_st;
struct i915_request *rq;
struct ttm_tt *ttm = bo->ttm;
struct ttm_tt *src_ttm = bo->ttm;
enum i915_cache_level src_level, dst_level;
int ret;
if (!i915->gt.migrate.context)
if (!i915->gt.migrate.context || intel_gt_is_wedged(&i915->gt))
return -EINVAL;
dst_level = i915_ttm_cache_level(i915, dst_mem, ttm);
if (!ttm || !ttm_tt_is_populated(ttm)) {
dst_level = i915_ttm_cache_level(i915, dst_mem, dst_ttm);
if (clear) {
if (bo->type == ttm_bo_type_kernel)
return -EINVAL;
if (ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC))
return 0;
intel_engine_pm_get(i915->gt.migrate.context->engine);
ret = intel_context_migrate_clear(i915->gt.migrate.context, NULL,
dst_st->sgl, dst_level,
@ -466,10 +476,10 @@ static int i915_ttm_accel_move(struct ttm_buffer_object *bo,
}
intel_engine_pm_put(i915->gt.migrate.context->engine);
} else {
src_st = src_man->use_tt ? i915_ttm_tt_get_st(ttm) :
src_st = src_man->use_tt ? i915_ttm_tt_get_st(src_ttm) :
obj->ttm.cached_io_st;
src_level = i915_ttm_cache_level(i915, bo->resource, ttm);
src_level = i915_ttm_cache_level(i915, bo->resource, src_ttm);
intel_engine_pm_get(i915->gt.migrate.context->engine);
ret = intel_context_migrate_copy(i915->gt.migrate.context,
NULL, src_st->sgl, src_level,
@ -487,6 +497,44 @@ static int i915_ttm_accel_move(struct ttm_buffer_object *bo,
return ret;
}
static void __i915_ttm_move(struct ttm_buffer_object *bo, bool clear,
struct ttm_resource *dst_mem,
struct ttm_tt *dst_ttm,
struct sg_table *dst_st,
bool allow_accel)
{
int ret = -EINVAL;
if (allow_accel)
ret = i915_ttm_accel_move(bo, clear, dst_mem, dst_ttm, dst_st);
if (ret) {
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
struct intel_memory_region *dst_reg, *src_reg;
union {
struct ttm_kmap_iter_tt tt;
struct ttm_kmap_iter_iomap io;
} _dst_iter, _src_iter;
struct ttm_kmap_iter *dst_iter, *src_iter;
dst_reg = i915_ttm_region(bo->bdev, dst_mem->mem_type);
src_reg = i915_ttm_region(bo->bdev, bo->resource->mem_type);
GEM_BUG_ON(!dst_reg || !src_reg);
dst_iter = !cpu_maps_iomem(dst_mem) ?
ttm_kmap_iter_tt_init(&_dst_iter.tt, dst_ttm) :
ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap,
dst_st, dst_reg->region.start);
src_iter = !cpu_maps_iomem(bo->resource) ?
ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) :
ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap,
obj->ttm.cached_io_st,
src_reg->region.start);
ttm_move_memcpy(clear, dst_mem->num_pages, dst_iter, src_iter);
}
}
static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
struct ttm_operation_ctx *ctx,
struct ttm_resource *dst_mem,
@ -495,19 +543,11 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
struct drm_i915_gem_object *obj = i915_ttm_to_gem(bo);
struct ttm_resource_manager *dst_man =
ttm_manager_type(bo->bdev, dst_mem->mem_type);
struct intel_memory_region *dst_reg, *src_reg;
union {
struct ttm_kmap_iter_tt tt;
struct ttm_kmap_iter_iomap io;
} _dst_iter, _src_iter;
struct ttm_kmap_iter *dst_iter, *src_iter;
struct ttm_tt *ttm = bo->ttm;
struct sg_table *dst_st;
bool clear;
int ret;
dst_reg = i915_ttm_region(bo->bdev, dst_mem->mem_type);
src_reg = i915_ttm_region(bo->bdev, bo->resource->mem_type);
GEM_BUG_ON(!dst_reg || !src_reg);
/* Sync for now. We could do the actual copy async. */
ret = ttm_bo_wait_ctx(bo, ctx);
if (ret)
@ -524,9 +564,8 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
}
/* Populate ttm with pages if needed. Typically system memory. */
if (bo->ttm && (dst_man->use_tt ||
(bo->ttm->page_flags & TTM_TT_FLAG_SWAPPED))) {
ret = ttm_tt_populate(bo->bdev, bo->ttm, ctx);
if (ttm && (dst_man->use_tt || (ttm->page_flags & TTM_TT_FLAG_SWAPPED))) {
ret = ttm_tt_populate(bo->bdev, ttm, ctx);
if (ret)
return ret;
}
@ -535,23 +574,10 @@ static int i915_ttm_move(struct ttm_buffer_object *bo, bool evict,
if (IS_ERR(dst_st))
return PTR_ERR(dst_st);
ret = i915_ttm_accel_move(bo, dst_mem, dst_st);
if (ret) {
/* If we start mapping GGTT, we can no longer use man::use_tt here. */
dst_iter = !cpu_maps_iomem(dst_mem) ?
ttm_kmap_iter_tt_init(&_dst_iter.tt, bo->ttm) :
ttm_kmap_iter_iomap_init(&_dst_iter.io, &dst_reg->iomap,
dst_st, dst_reg->region.start);
clear = !cpu_maps_iomem(bo->resource) && (!ttm || !ttm_tt_is_populated(ttm));
if (!(clear && ttm && !(ttm->page_flags & TTM_TT_FLAG_ZERO_ALLOC)))
__i915_ttm_move(bo, clear, dst_mem, bo->ttm, dst_st, true);
src_iter = !cpu_maps_iomem(bo->resource) ?
ttm_kmap_iter_tt_init(&_src_iter.tt, bo->ttm) :
ttm_kmap_iter_iomap_init(&_src_iter.io, &src_reg->iomap,
obj->ttm.cached_io_st,
src_reg->region.start);
ttm_move_memcpy(bo, dst_mem->num_pages, dst_iter, src_iter);
}
/* Below dst_mem becomes bo->resource. */
ttm_bo_move_sync_cleanup(bo, dst_mem);
i915_ttm_adjust_domains_after_move(obj);
i915_ttm_free_cached_io_st(obj);
@ -789,12 +815,9 @@ static void i915_ttm_adjust_lru(struct drm_i915_gem_object *obj)
*/
static void i915_ttm_delayed_free(struct drm_i915_gem_object *obj)
{
if (obj->ttm.created) {
ttm_bo_put(i915_gem_to_ttm(obj));
} else {
__i915_gem_free_object(obj);
call_rcu(&obj->rcu, __i915_gem_free_object_rcu);
}
GEM_BUG_ON(!obj->ttm.created);
ttm_bo_put(i915_gem_to_ttm(obj));
}
static vm_fault_t vm_fault_ttm(struct vm_fault *vmf)
@ -876,8 +899,17 @@ void i915_ttm_bo_destroy(struct ttm_buffer_object *bo)
i915_gem_object_release_memory_region(obj);
mutex_destroy(&obj->ttm.get_io_page.lock);
if (obj->ttm.created)
if (obj->ttm.created) {
i915_ttm_backup_free(obj);
/* This releases all gem object bindings to the backend. */
__i915_gem_free_object(obj);
call_rcu(&obj->rcu, __i915_gem_free_object_rcu);
} else {
__i915_gem_object_fini(obj);
}
}
/**
@ -906,7 +938,11 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
drm_gem_private_object_init(&i915->drm, &obj->base, size);
i915_gem_object_init(obj, &i915_gem_ttm_obj_ops, &lock_class, flags);
i915_gem_object_init_memory_region(obj, mem);
/* Don't put on a region list until we're either locked or fully initialized. */
obj->mm.region = intel_memory_region_get(mem);
INIT_LIST_HEAD(&obj->mm.region_link);
i915_gem_object_make_unshrinkable(obj);
INIT_RADIX_TREE(&obj->ttm.get_io_page.radix, GFP_KERNEL | __GFP_NOWARN);
mutex_init(&obj->ttm.get_io_page.lock);
@ -933,6 +969,8 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
return i915_ttm_err_to_gem(ret);
obj->ttm.created = true;
i915_gem_object_release_memory_region(obj);
i915_gem_object_init_memory_region(obj, mem);
i915_ttm_adjust_domains_after_move(obj);
i915_ttm_adjust_gem_after_move(obj);
i915_gem_object_unlock(obj);
@ -961,3 +999,50 @@ i915_gem_ttm_system_setup(struct drm_i915_private *i915,
intel_memory_region_set_name(mr, "system-ttm");
return mr;
}
/**
* i915_gem_obj_copy_ttm - Copy the contents of one ttm-based gem object to
* another
* @dst: The destination object
* @src: The source object
* @allow_accel: Allow using the blitter. Otherwise TTM memcpy is used.
* @intr: Whether to perform waits interruptibly.
*
* Note: The caller is responsible for ensuring that the underlying
* TTM objects are populated if needed and locked.
*
* Return: Zero on success. Negative error code on error. If @intr == true,
* then it may return -ERESTARTSYS or -EINTR.
*/
int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
struct drm_i915_gem_object *src,
bool allow_accel, bool intr)
{
struct ttm_buffer_object *dst_bo = i915_gem_to_ttm(dst);
struct ttm_buffer_object *src_bo = i915_gem_to_ttm(src);
struct ttm_operation_ctx ctx = {
.interruptible = intr,
};
struct sg_table *dst_st;
int ret;
assert_object_held(dst);
assert_object_held(src);
/*
* Sync for now. This will change with async moves.
*/
ret = ttm_bo_wait_ctx(dst_bo, &ctx);
if (!ret)
ret = ttm_bo_wait_ctx(src_bo, &ctx);
if (ret)
return ret;
dst_st = gpu_binds_iomem(dst_bo->resource) ?
dst->ttm.cached_io_st : i915_ttm_tt_get_st(dst_bo->ttm);
__i915_ttm_move(src_bo, false, dst_bo->resource, dst_bo->ttm,
dst_st, allow_accel);
return 0;
}
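A hedged caller-side sketch: per the kerneldoc above, both objects must be locked (here through one ww context) and populated before the copy; ww is assumed to be an initialized struct i915_gem_ww_ctx * in scope:

int err;

err = i915_gem_object_lock(dst, ww);
if (!err)
	err = i915_gem_object_lock(src, ww);
if (!err)
	/* allow_accel = true: try the blitter first, memcpy as fallback. */
	err = i915_gem_obj_copy_ttm(dst, src, true, false);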


@ -46,4 +46,18 @@ int __i915_gem_ttm_object_init(struct intel_memory_region *mem,
resource_size_t size,
resource_size_t page_size,
unsigned int flags);
int i915_gem_obj_copy_ttm(struct drm_i915_gem_object *dst,
struct drm_i915_gem_object *src,
bool allow_accel, bool intr);
/* Internal I915 TTM declarations and definitions below. */
#define I915_PL_LMEM0 TTM_PL_PRIV
#define I915_PL_SYSTEM TTM_PL_SYSTEM
#define I915_PL_STOLEN TTM_PL_VRAM
#define I915_PL_GGTT TTM_PL_TT
struct ttm_placement *i915_ttm_sys_placement(void);
#endif


@ -0,0 +1,206 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2021 Intel Corporation
*/
#include <drm/ttm/ttm_placement.h>
#include <drm/ttm/ttm_tt.h>
#include "i915_drv.h"
#include "intel_memory_region.h"
#include "intel_region_ttm.h"
#include "gem/i915_gem_region.h"
#include "gem/i915_gem_ttm.h"
#include "gem/i915_gem_ttm_pm.h"
/**
* i915_ttm_backup_free - Free any backup attached to this object
* @obj: The object whose backup is to be freed.
*/
void i915_ttm_backup_free(struct drm_i915_gem_object *obj)
{
if (obj->ttm.backup) {
i915_gem_object_put(obj->ttm.backup);
obj->ttm.backup = NULL;
}
}
/**
* struct i915_gem_ttm_pm_apply - Apply-to-region subclass for backup/restore
* @base: The i915_gem_apply_to_region we derive from.
* @allow_gpu: Whether using the gpu blitter is allowed.
* @backup_pinned: On backup, backup also pinned objects.
*/
struct i915_gem_ttm_pm_apply {
struct i915_gem_apply_to_region base;
bool allow_gpu : 1;
bool backup_pinned : 1;
};
static int i915_ttm_backup(struct i915_gem_apply_to_region *apply,
struct drm_i915_gem_object *obj)
{
struct i915_gem_ttm_pm_apply *pm_apply =
container_of(apply, typeof(*pm_apply), base);
struct ttm_buffer_object *bo = i915_gem_to_ttm(obj);
struct ttm_buffer_object *backup_bo;
struct drm_i915_private *i915 =
container_of(bo->bdev, typeof(*i915), bdev);
struct drm_i915_gem_object *backup;
struct ttm_operation_ctx ctx = {};
int err = 0;
if (bo->resource->mem_type == I915_PL_SYSTEM || obj->ttm.backup)
return 0;
if (pm_apply->allow_gpu && i915_gem_object_evictable(obj))
return ttm_bo_validate(bo, i915_ttm_sys_placement(), &ctx);
if (!pm_apply->backup_pinned ||
(pm_apply->allow_gpu && (obj->flags & I915_BO_ALLOC_PM_EARLY)))
return 0;
if (obj->flags & I915_BO_ALLOC_PM_VOLATILE)
return 0;
backup = i915_gem_object_create_shmem(i915, obj->base.size);
if (IS_ERR(backup))
return PTR_ERR(backup);
err = i915_gem_object_lock(backup, apply->ww);
if (err)
goto out_no_lock;
backup_bo = i915_gem_to_ttm(backup);
err = ttm_tt_populate(backup_bo->bdev, backup_bo->ttm, &ctx);
if (err)
goto out_no_populate;
err = i915_gem_obj_copy_ttm(backup, obj, pm_apply->allow_gpu, false);
GEM_WARN_ON(err);
obj->ttm.backup = backup;
return 0;
out_no_populate:
i915_gem_ww_unlock_single(backup);
out_no_lock:
i915_gem_object_put(backup);
return err;
}
static int i915_ttm_recover(struct i915_gem_apply_to_region *apply,
struct drm_i915_gem_object *obj)
{
i915_ttm_backup_free(obj);
return 0;
}
/**
* i915_ttm_recover_region - Free the backup of all objects of a region
* @mr: The memory region
*
* Checks all objects of a region for an attached backup and, if one is
* present, frees it. Typically this is called to recover after a partially
* performed backup.
*/
void i915_ttm_recover_region(struct intel_memory_region *mr)
{
static const struct i915_gem_apply_to_region_ops recover_ops = {
.process_obj = i915_ttm_recover,
};
struct i915_gem_apply_to_region apply = {.ops = &recover_ops};
int ret;
ret = i915_gem_process_region(mr, &apply);
GEM_WARN_ON(ret);
}
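Both region helpers in this file build on the generic apply-to-region machinery seen above. A hedged sketch of that pattern with a hypothetical handler (my_process_obj is an illustration, not part of this series):

static int my_process_obj(struct i915_gem_apply_to_region *apply,
			  struct drm_i915_gem_object *obj)
{
	/* Inspect or transform @obj; a ww context is available in apply->ww. */
	return 0;
}

static const struct i915_gem_apply_to_region_ops my_ops = {
	.process_obj = my_process_obj,
};

struct i915_gem_apply_to_region apply = { .ops = &my_ops };
int ret = i915_gem_process_region(mr, &apply);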
/**
* i915_ttm_backup_region - Back up all objects of a region to smem.
* @mr: The memory region
* @flags: TTM backup flags; I915_TTM_BACKUP_ALLOW_GPU allows using the gpu
* blitter for this backup, I915_TTM_BACKUP_PINNED backs up pinned objects
* as well.
*
* Loops over all objects of a region and either evicts them if they are
* evictable or backs them up using a backup object if they are pinned.
*
* Return: Zero on success. Negative error code on error.
*/
int i915_ttm_backup_region(struct intel_memory_region *mr, u32 flags)
{
static const struct i915_gem_apply_to_region_ops backup_ops = {
.process_obj = i915_ttm_backup,
};
struct i915_gem_ttm_pm_apply pm_apply = {
.base = {.ops = &backup_ops},
.allow_gpu = flags & I915_TTM_BACKUP_ALLOW_GPU,
.backup_pinned = flags & I915_TTM_BACKUP_PINNED,
};
return i915_gem_process_region(mr, &pm_apply.base);
}
static int i915_ttm_restore(struct i915_gem_apply_to_region *apply,
struct drm_i915_gem_object *obj)
{
struct i915_gem_ttm_pm_apply *pm_apply =
container_of(apply, typeof(*pm_apply), base);
struct drm_i915_gem_object *backup = obj->ttm.backup;
struct ttm_buffer_object *backup_bo = i915_gem_to_ttm(backup);
struct ttm_operation_ctx ctx = {};
int err;
if (!backup)
return 0;
if (!pm_apply->allow_gpu && !(obj->flags & I915_BO_ALLOC_PM_EARLY))
return 0;
err = i915_gem_object_lock(backup, apply->ww);
if (err)
return err;
/* Content may have been swapped. */
err = ttm_tt_populate(backup_bo->bdev, backup_bo->ttm, &ctx);
if (!err) {
err = i915_gem_obj_copy_ttm(obj, backup, pm_apply->allow_gpu,
false);
GEM_WARN_ON(err);
obj->ttm.backup = NULL;
err = 0;
}
i915_gem_ww_unlock_single(backup);
if (!err)
i915_gem_object_put(backup);
return err;
}
/**
* i915_ttm_restore_region - Restore backed-up objects of a region from smem.
* @mr: The memory region
* @flags: TTM backup flags; I915_TTM_BACKUP_ALLOW_GPU allows using the gpu
* blitter for the restore.
*
* Loops over all objects of a region and, if they are backed up, restores
* them from smem.
*
* Return: Zero on success. Negative error code on error.
*/
int i915_ttm_restore_region(struct intel_memory_region *mr, u32 flags)
{
static const struct i915_gem_apply_to_region_ops restore_ops = {
.process_obj = i915_ttm_restore,
};
struct i915_gem_ttm_pm_apply pm_apply = {
.base = {.ops = &restore_ops},
.allow_gpu = flags & I915_TTM_BACKUP_ALLOW_GPU,
};
return i915_gem_process_region(mr, &pm_apply.base);
}
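Taken together, a rough sketch (i915 assumed in scope, error handling abbreviated) of how a suspend/resume path might drive these helpers across the local-memory regions, using the existing for_each_memory_region() iterator:

struct intel_memory_region *mr;
int id, ret;

/* Suspend: evict what we can, back up what is pinned. */
for_each_memory_region(mr, i915, id) {
	if (mr->type != INTEL_MEMORY_LOCAL)
		continue;
	ret = i915_ttm_backup_region(mr, I915_TTM_BACKUP_ALLOW_GPU |
				     I915_TTM_BACKUP_PINNED);
	if (ret) /* Undo a partially performed backup. */
		i915_ttm_recover_region(mr);
}

/* Resume: restore the backed-up objects, blitter allowed again. */
for_each_memory_region(mr, i915, id)
	if (mr->type == INTEL_MEMORY_LOCAL)
		i915_ttm_restore_region(mr, I915_TTM_BACKUP_ALLOW_GPU);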


@ -0,0 +1,26 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2021 Intel Corporation
*/
#ifndef _I915_GEM_TTM_PM_H_
#define _I915_GEM_TTM_PM_H_
#include <linux/types.h>
struct intel_memory_region;
struct drm_i915_gem_object;
#define I915_TTM_BACKUP_ALLOW_GPU BIT(0)
#define I915_TTM_BACKUP_PINNED BIT(1)
int i915_ttm_backup_region(struct intel_memory_region *mr, u32 flags);
void i915_ttm_recover_region(struct intel_memory_region *mr);
int i915_ttm_restore_region(struct intel_memory_region *mr, u32 flags);
/* Internal I915 TTM functions below. */
void i915_ttm_backup_free(struct drm_i915_gem_object *obj);
#endif


@ -6,7 +6,6 @@
#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/pagemap.h>
#include "i915_drv.h"
#include "i915_gemfs.h"
@ -15,6 +14,7 @@ int i915_gemfs_init(struct drm_i915_private *i915)
{
struct file_system_type *type;
struct vfsmount *gemfs;
char *opts;
type = get_fs_type("tmpfs");
if (!type)
@ -26,10 +26,26 @@ int i915_gemfs_init(struct drm_i915_private *i915)
*
* One example, although it is probably better with a per-file
* control, is selecting huge page allocations ("huge=within_size").
* Currently unused due to bandwidth issues (slow reads) on Broadwell+.
* However, we only do so to offset the overhead of iommu lookups
* due to bandwidth issues (slow reads) on Broadwell+.
*/
gemfs = kern_mount(type);
opts = NULL;
if (intel_vtd_active()) {
if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
static char huge_opt[] = "huge=within_size"; /* r/w */
opts = huge_opt;
drm_info(&i915->drm,
"Transparent Hugepage mode '%s'\n",
opts);
} else {
drm_notice(&i915->drm,
"Transparent Hugepage support is recommended for optimal performance when IOMMU is enabled!\n");
}
}
gemfs = vfs_kern_mount(type, SB_KERNMOUNT, type->name, opts);
if (IS_ERR(gemfs))
return PTR_ERR(gemfs);


@ -1456,7 +1456,7 @@ static int igt_tmpfs_fallback(void *arg)
struct i915_gem_context *ctx = arg;
struct drm_i915_private *i915 = ctx->i915;
struct vfsmount *gemfs = i915->mm.gemfs;
struct i915_address_space *vm = i915_gem_context_get_vm_rcu(ctx);
struct i915_address_space *vm = i915_gem_context_get_eb_vm(ctx);
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
u32 *vaddr;
@ -1512,13 +1512,14 @@ static int igt_shrink_thp(void *arg)
{
struct i915_gem_context *ctx = arg;
struct drm_i915_private *i915 = ctx->i915;
struct i915_address_space *vm = i915_gem_context_get_vm_rcu(ctx);
struct i915_address_space *vm = i915_gem_context_get_eb_vm(ctx);
struct drm_i915_gem_object *obj;
struct i915_gem_engines_iter it;
struct intel_context *ce;
struct i915_vma *vma;
unsigned int flags = PIN_USER;
unsigned int n;
bool should_swap;
int err = 0;
/*
@ -1567,23 +1568,39 @@ static int igt_shrink_thp(void *arg)
break;
}
i915_gem_context_unlock_engines(ctx);
/*
* Nuke everything *before* we unpin the pages so we can be reasonably
* sure that, when we later check get_nr_swap_pages(), some random
* leftover object doesn't steal the remaining swap space.
*/
i915_gem_shrink(NULL, i915, -1UL, NULL,
I915_SHRINK_BOUND |
I915_SHRINK_UNBOUND |
I915_SHRINK_ACTIVE);
i915_vma_unpin(vma);
if (err)
goto out_put;
/*
* Now that the pages are *unpinned* shrink-all should invoke
* shmem to truncate our pages.
* Now that the pages are *unpinned* shrinking should invoke
* shmem to truncate our pages, if we have available swap.
*/
i915_gem_shrink_all(i915);
if (i915_gem_object_has_pages(obj)) {
pr_err("shrink-all didn't truncate the pages\n");
should_swap = get_nr_swap_pages() > 0;
i915_gem_shrink(NULL, i915, -1UL, NULL,
I915_SHRINK_BOUND |
I915_SHRINK_UNBOUND |
I915_SHRINK_ACTIVE |
I915_SHRINK_WRITEBACK);
if (should_swap == i915_gem_object_has_pages(obj)) {
pr_err("unexpected pages mismatch, should_swap=%s\n",
yesno(should_swap));
err = -EINVAL;
goto out_put;
}
if (obj->mm.page_sizes.sg || obj->mm.page_sizes.phys) {
pr_err("residual page-size bits left\n");
if (should_swap == (obj->mm.page_sizes.sg || obj->mm.page_sizes.phys)) {
pr_err("unexpected residual page-size bits, should_swap=%s\n",
yesno(should_swap));
err = -EINVAL;
goto out_put;
}
@ -1629,7 +1646,7 @@ int i915_gem_huge_page_mock_selftests(void)
mkwrite_device_info(dev_priv)->ppgtt_type = INTEL_PPGTT_FULL;
mkwrite_device_info(dev_priv)->ppgtt_size = 48;
ppgtt = i915_ppgtt_create(&dev_priv->gt);
ppgtt = i915_ppgtt_create(&dev_priv->gt, 0);
if (IS_ERR(ppgtt)) {
err = PTR_ERR(ppgtt);
goto out_unlock;
@ -1688,11 +1705,9 @@ int i915_gem_huge_page_live_selftests(struct drm_i915_private *i915)
goto out_file;
}
mutex_lock(&ctx->mutex);
vm = i915_gem_context_vm(ctx);
vm = ctx->vm;
if (vm)
WRITE_ONCE(vm->scrub_64K, true);
mutex_unlock(&ctx->mutex);
err = i915_subtests(tests, ctx);


@ -27,12 +27,6 @@
#define DW_PER_PAGE (PAGE_SIZE / sizeof(u32))
static inline struct i915_address_space *ctx_vm(struct i915_gem_context *ctx)
{
/* single threaded, private ctx */
return rcu_dereference_protected(ctx->vm, true);
}
static int live_nop_switch(void *arg)
{
const unsigned int nctx = 1024;
@ -94,7 +88,7 @@ static int live_nop_switch(void *arg)
rq = i915_request_get(this);
i915_request_add(this);
}
if (i915_request_wait(rq, 0, HZ / 5) < 0) {
if (i915_request_wait(rq, 0, HZ) < 0) {
pr_err("Failed to populated %d contexts\n", nctx);
intel_gt_set_wedged(&i915->gt);
i915_request_put(rq);
@ -704,7 +698,7 @@ static int igt_ctx_exec(void *arg)
pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) [full-ppgtt? %s], err=%d\n",
ndwords, dw, max_dwords(obj),
engine->name,
yesno(!!rcu_access_pointer(ctx->vm)),
yesno(i915_gem_context_has_full_ppgtt(ctx)),
err);
intel_context_put(ce);
kernel_context_close(ctx);
@ -813,7 +807,7 @@ static int igt_shared_ctx_exec(void *arg)
struct i915_gem_context *ctx;
struct intel_context *ce;
ctx = kernel_context(i915, ctx_vm(parent));
ctx = kernel_context(i915, parent->vm);
if (IS_ERR(ctx)) {
err = PTR_ERR(ctx);
goto out_test;
@ -823,7 +817,7 @@ static int igt_shared_ctx_exec(void *arg)
GEM_BUG_ON(IS_ERR(ce));
if (!obj) {
obj = create_test_object(ctx_vm(parent),
obj = create_test_object(parent->vm,
file, &objects);
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
@ -838,7 +832,7 @@ static int igt_shared_ctx_exec(void *arg)
pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) [full-ppgtt? %s], err=%d\n",
ndwords, dw, max_dwords(obj),
engine->name,
yesno(!!rcu_access_pointer(ctx->vm)),
yesno(i915_gem_context_has_full_ppgtt(ctx)),
err);
intel_context_put(ce);
kernel_context_close(ctx);
@ -1380,7 +1374,7 @@ static int igt_ctx_readonly(void *arg)
goto out_file;
}
vm = ctx_vm(ctx) ?: &i915->ggtt.alias->vm;
vm = ctx->vm ?: &i915->ggtt.alias->vm;
if (!vm || !vm->has_read_only) {
err = 0;
goto out_file;
@ -1417,7 +1411,7 @@ static int igt_ctx_readonly(void *arg)
pr_err("Failed to fill dword %lu [%lu/%lu] with gpu (%s) [full-ppgtt? %s], err=%d\n",
ndwords, dw, max_dwords(obj),
ce->engine->name,
yesno(!!ctx_vm(ctx)),
yesno(i915_gem_context_has_full_ppgtt(ctx)),
err);
i915_gem_context_unlock_engines(ctx);
goto out_file;
@ -1499,7 +1493,7 @@ static int write_to_scratch(struct i915_gem_context *ctx,
GEM_BUG_ON(offset < I915_GTT_PAGE_SIZE);
err = check_scratch(ctx_vm(ctx), offset);
err = check_scratch(ctx->vm, offset);
if (err)
return err;
@ -1528,7 +1522,7 @@ static int write_to_scratch(struct i915_gem_context *ctx,
intel_gt_chipset_flush(engine->gt);
vm = i915_gem_context_get_vm_rcu(ctx);
vm = i915_gem_context_get_eb_vm(ctx);
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
@ -1596,7 +1590,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
GEM_BUG_ON(offset < I915_GTT_PAGE_SIZE);
err = check_scratch(ctx_vm(ctx), offset);
err = check_scratch(ctx->vm, offset);
if (err)
return err;
@ -1607,7 +1601,7 @@ static int read_from_scratch(struct i915_gem_context *ctx,
if (GRAPHICS_VER(i915) >= 8) {
const u32 GPR0 = engine->mmio_base + 0x600;
vm = i915_gem_context_get_vm_rcu(ctx);
vm = i915_gem_context_get_eb_vm(ctx);
vma = i915_vma_instance(obj, vm, NULL);
if (IS_ERR(vma)) {
err = PTR_ERR(vma);
@ -1739,7 +1733,7 @@ static int check_scratch_page(struct i915_gem_context *ctx, u32 *out)
u32 *vaddr;
int err = 0;
vm = ctx_vm(ctx);
vm = ctx->vm;
if (!vm)
return -ENODEV;
@ -1801,7 +1795,7 @@ static int igt_vm_isolation(void *arg)
}
/* We can only test vm isolation, if the vm are distinct */
if (ctx_vm(ctx_a) == ctx_vm(ctx_b))
if (ctx_a->vm == ctx_b->vm)
goto out_file;
/* Read the initial state of the scratch page */
@ -1813,8 +1807,8 @@ static int igt_vm_isolation(void *arg)
if (err)
goto out_file;
vm_total = ctx_vm(ctx_a)->total;
GEM_BUG_ON(ctx_vm(ctx_b)->total != vm_total);
vm_total = ctx_a->vm->total;
GEM_BUG_ON(ctx_b->vm->total != vm_total);
count = 0;
num_engines = 0;


@ -1,190 +0,0 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2020 Intel Corporation
*/
#include "i915_selftest.h"
#include "gt/intel_engine_pm.h"
#include "selftests/igt_flush_test.h"
static u64 read_reloc(const u32 *map, int x, const u64 mask)
{
u64 reloc;
memcpy(&reloc, &map[x], sizeof(reloc));
return reloc & mask;
}
static int __igt_gpu_reloc(struct i915_execbuffer *eb,
struct drm_i915_gem_object *obj)
{
const unsigned int offsets[] = { 8, 3, 0 };
const u64 mask =
GENMASK_ULL(eb->reloc_cache.use_64bit_reloc ? 63 : 31, 0);
const u32 *map = page_mask_bits(obj->mm.mapping);
struct i915_request *rq;
struct i915_vma *vma;
int err;
int i;
vma = i915_vma_instance(obj, eb->context->vm, NULL);
if (IS_ERR(vma))
return PTR_ERR(vma);
err = i915_gem_object_lock(obj, &eb->ww);
if (err)
return err;
err = i915_vma_pin_ww(vma, &eb->ww, 0, 0, PIN_USER | PIN_HIGH);
if (err)
return err;
/* 8-Byte aligned */
err = __reloc_entry_gpu(eb, vma, offsets[0] * sizeof(u32), 0);
if (err <= 0)
goto reloc_err;
/* !8-Byte aligned */
err = __reloc_entry_gpu(eb, vma, offsets[1] * sizeof(u32), 1);
if (err <= 0)
goto reloc_err;
/* Skip to the end of the cmd page */
i = PAGE_SIZE / sizeof(u32) - 1;
i -= eb->reloc_cache.rq_size;
memset32(eb->reloc_cache.rq_cmd + eb->reloc_cache.rq_size,
MI_NOOP, i);
eb->reloc_cache.rq_size += i;
/* Force next batch */
err = __reloc_entry_gpu(eb, vma, offsets[2] * sizeof(u32), 2);
if (err <= 0)
goto reloc_err;
GEM_BUG_ON(!eb->reloc_cache.rq);
rq = i915_request_get(eb->reloc_cache.rq);
reloc_gpu_flush(eb, &eb->reloc_cache);
GEM_BUG_ON(eb->reloc_cache.rq);
err = i915_gem_object_wait(obj, I915_WAIT_INTERRUPTIBLE, HZ / 2);
if (err) {
intel_gt_set_wedged(eb->engine->gt);
goto put_rq;
}
if (!i915_request_completed(rq)) {
pr_err("%s: did not wait for relocations!\n", eb->engine->name);
err = -EINVAL;
goto put_rq;
}
for (i = 0; i < ARRAY_SIZE(offsets); i++) {
u64 reloc = read_reloc(map, offsets[i], mask);
if (reloc != i) {
pr_err("%s[%d]: map[%d] %llx != %x\n",
eb->engine->name, i, offsets[i], reloc, i);
err = -EINVAL;
}
}
if (err)
igt_hexdump(map, 4096);
put_rq:
i915_request_put(rq);
unpin_vma:
i915_vma_unpin(vma);
return err;
reloc_err:
if (!err)
err = -EIO;
goto unpin_vma;
}
static int igt_gpu_reloc(void *arg)
{
struct i915_execbuffer eb;
struct drm_i915_gem_object *scratch;
int err = 0;
u32 *map;
eb.i915 = arg;
scratch = i915_gem_object_create_internal(eb.i915, 4096);
if (IS_ERR(scratch))
return PTR_ERR(scratch);
map = i915_gem_object_pin_map_unlocked(scratch, I915_MAP_WC);
if (IS_ERR(map)) {
err = PTR_ERR(map);
goto err_scratch;
}
intel_gt_pm_get(&eb.i915->gt);
for_each_uabi_engine(eb.engine, eb.i915) {
if (intel_engine_requires_cmd_parser(eb.engine) ||
intel_engine_using_cmd_parser(eb.engine))
continue;
reloc_cache_init(&eb.reloc_cache, eb.i915);
memset(map, POISON_INUSE, 4096);
intel_engine_pm_get(eb.engine);
eb.context = intel_context_create(eb.engine);
if (IS_ERR(eb.context)) {
err = PTR_ERR(eb.context);
goto err_pm;
}
eb.reloc_pool = NULL;
eb.reloc_context = NULL;
i915_gem_ww_ctx_init(&eb.ww, false);
retry:
err = intel_context_pin_ww(eb.context, &eb.ww);
if (!err) {
err = __igt_gpu_reloc(&eb, scratch);
intel_context_unpin(eb.context);
}
if (err == -EDEADLK) {
err = i915_gem_ww_ctx_backoff(&eb.ww);
if (!err)
goto retry;
}
i915_gem_ww_ctx_fini(&eb.ww);
if (eb.reloc_pool)
intel_gt_buffer_pool_put(eb.reloc_pool);
if (eb.reloc_context)
intel_context_put(eb.reloc_context);
intel_context_put(eb.context);
err_pm:
intel_engine_pm_put(eb.engine);
if (err)
break;
}
if (igt_flush_test(eb.i915))
err = -EIO;
intel_gt_pm_put(&eb.i915->gt);
err_scratch:
i915_gem_object_put(scratch);
return err;
}
int i915_gem_execbuffer_live_selftests(struct drm_i915_private *i915)
{
static const struct i915_subtest tests[] = {
SUBTEST(igt_gpu_reloc),
};
if (intel_gt_is_wedged(&i915->gt))
return 0;
return i915_live_subtests(tests, i915);
}


@ -903,7 +903,9 @@ static int __igt_mmap(struct drm_i915_private *i915,
pr_debug("igt_mmap(%s, %d) @ %lx\n", obj->mm.region->name, type, addr);
mmap_read_lock(current->mm);
area = vma_lookup(current->mm, addr);
mmap_read_unlock(current->mm);
if (!area) {
pr_err("%s: Did not create a vm_area_struct for the mmap\n",
obj->mm.region->name);


@ -23,6 +23,7 @@ mock_context(struct drm_i915_private *i915,
kref_init(&ctx->ref);
INIT_LIST_HEAD(&ctx->link);
ctx->i915 = i915;
INIT_WORK(&ctx->release_work, i915_gem_context_release_work);
mutex_init(&ctx->mutex);
@ -87,7 +88,7 @@ live_context(struct drm_i915_private *i915, struct file *file)
return ERR_CAST(pc);
ctx = i915_gem_create_context(i915, pc);
proto_context_close(pc);
proto_context_close(i915, pc);
if (IS_ERR(ctx))
return ctx;
@ -162,7 +163,7 @@ kernel_context(struct drm_i915_private *i915,
}
ctx = i915_gem_create_context(i915, pc);
proto_context_close(pc);
proto_context_close(i915, pc);
if (IS_ERR(ctx))
return ctx;


@ -1,14 +0,0 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2019 Intel Corporation
*/
#ifndef DEBUGFS_ENGINES_H
#define DEBUGFS_ENGINES_H
struct intel_gt;
struct dentry;
void debugfs_engines_register(struct intel_gt *gt, struct dentry *root);
#endif /* DEBUGFS_ENGINES_H */

View File

@ -1,14 +0,0 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2019 Intel Corporation
*/
#ifndef DEBUGFS_GT_PM_H
#define DEBUGFS_GT_PM_H
struct intel_gt;
struct dentry;
void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root);
#endif /* DEBUGFS_GT_PM_H */


@ -429,7 +429,7 @@ struct i915_ppgtt *gen6_ppgtt_create(struct intel_gt *gt)
mutex_init(&ppgtt->flush);
mutex_init(&ppgtt->pin_mutex);
ppgtt_init(&ppgtt->base, gt);
ppgtt_init(&ppgtt->base, gt, 0);
ppgtt->base.vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen6_pte_t));
ppgtt->base.vm.top = 1;


@ -548,6 +548,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
I915_GTT_PAGE_SIZE_2M)))) {
vaddr = px_vaddr(pd);
vaddr[maybe_64K] |= GEN8_PDE_IPS_64K;
clflush_cache_range(vaddr, PAGE_SIZE);
page_size = I915_GTT_PAGE_SIZE_64K;
/*
@ -568,6 +569,7 @@ static void gen8_ppgtt_insert_huge(struct i915_vma *vma,
for (i = 1; i < index; i += 16)
memset64(vaddr + i, encode, 15);
clflush_cache_range(vaddr, PAGE_SIZE);
}
}
@ -751,7 +753,8 @@ gen8_alloc_top_pd(struct i915_address_space *vm)
* space.
*
*/
struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
unsigned long lmem_pt_obj_flags)
{
struct i915_ppgtt *ppgtt;
int err;
@ -760,7 +763,7 @@ struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt)
if (!ppgtt)
return ERR_PTR(-ENOMEM);
ppgtt_init(ppgtt, gt);
ppgtt_init(ppgtt, gt, lmem_pt_obj_flags);
ppgtt->vm.top = i915_vm_is_4lvl(&ppgtt->vm) ? 3 : 2;
ppgtt->vm.pd_shift = ilog2(SZ_4K * SZ_4K / sizeof(gen8_pte_t));


@ -12,7 +12,9 @@ struct i915_address_space;
struct intel_gt;
enum i915_cache_level;
struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt);
struct i915_ppgtt *gen8_ppgtt_create(struct intel_gt *gt,
unsigned long lmem_pt_obj_flags);
u64 gen8_ggtt_pte_encode(dma_addr_t addr,
enum i915_cache_level level,
u32 flags);


@ -394,19 +394,18 @@ intel_context_init(struct intel_context *ce, struct intel_engine_cs *engine)
spin_lock_init(&ce->guc_state.lock);
INIT_LIST_HEAD(&ce->guc_state.fences);
INIT_LIST_HEAD(&ce->guc_state.requests);
spin_lock_init(&ce->guc_active.lock);
INIT_LIST_HEAD(&ce->guc_active.requests);
ce->guc_id = GUC_INVALID_LRC_ID;
INIT_LIST_HEAD(&ce->guc_id_link);
ce->guc_id.id = GUC_INVALID_LRC_ID;
INIT_LIST_HEAD(&ce->guc_id.link);
/*
* Initialize fence to be complete as this is expected to be complete
* unless there is a pending schedule disable outstanding.
*/
i915_sw_fence_init(&ce->guc_blocked, sw_fence_dummy_notify);
i915_sw_fence_commit(&ce->guc_blocked);
i915_sw_fence_init(&ce->guc_state.blocked,
sw_fence_dummy_notify);
i915_sw_fence_commit(&ce->guc_state.blocked);
i915_active_init(&ce->active,
__intel_context_active, __intel_context_retire, 0);
@ -420,6 +419,7 @@ void intel_context_fini(struct intel_context *ce)
mutex_destroy(&ce->pin_mutex);
i915_active_fini(&ce->active);
i915_sw_fence_fini(&ce->guc_state.blocked);
}
void i915_context_module_exit(void)
@ -520,15 +520,15 @@ struct i915_request *intel_context_find_active_request(struct intel_context *ce)
GEM_BUG_ON(!intel_engine_uses_guc(ce->engine));
spin_lock_irqsave(&ce->guc_active.lock, flags);
list_for_each_entry_reverse(rq, &ce->guc_active.requests,
spin_lock_irqsave(&ce->guc_state.lock, flags);
list_for_each_entry_reverse(rq, &ce->guc_state.requests,
sched.link) {
if (i915_request_completed(rq))
break;
active = rq;
}
spin_unlock_irqrestore(&ce->guc_active.lock, flags);
spin_unlock_irqrestore(&ce->guc_state.lock, flags);
return active;
}


@ -112,6 +112,7 @@ struct intel_context {
#define CONTEXT_FORCE_SINGLE_SUBMISSION 7
#define CONTEXT_NOPREEMPT 8
#define CONTEXT_LRCA_DIRTY 9
#define CONTEXT_GUC_INIT 10
struct {
u64 timeout_us;
@ -152,52 +153,83 @@ struct intel_context {
/** sseu: Control eu/slice partitioning */
struct intel_sseu sseu;
/**
* pinned_contexts_link: List link for the engine's pinned contexts.
* This is only used if this is a perma-pinned kernel context, and
* the list is assumed to be manipulated only during driver load
* or unload time, so there is currently no mutex protection.
*/
struct list_head pinned_contexts_link;
u8 wa_bb_page; /* if set, page num reserved for context workarounds */
struct {
/** lock: protects everything in guc_state */
/** @lock: protects everything in guc_state */
spinlock_t lock;
/**
* sched_state: scheduling state of this context using GuC
* @sched_state: scheduling state of this context using GuC
* submission
*/
u16 sched_state;
u32 sched_state;
/*
* fences: maintains a list of requests that have a submit
* fence related to GuC submission
* @fences: maintains a list of requests that are currently
* being fenced until a GuC operation completes
*/
struct list_head fences;
/**
* @blocked: fence used to signal when the blocking of a
* context's submissions is complete.
*/
struct i915_sw_fence blocked;
/** @number_committed_requests: number of committed requests */
int number_committed_requests;
/** @requests: list of active requests on this context */
struct list_head requests;
/** @prio: the context's current guc priority */
u8 prio;
/**
* @prio_count: a counter of the number of requests in flight in
* each priority bucket
*/
u32 prio_count[GUC_CLIENT_PRIORITY_NUM];
} guc_state;
struct {
/** lock: protects everything in guc_active */
spinlock_t lock;
/** requests: active requests on this context */
struct list_head requests;
} guc_active;
/**
* @id: handle which is used to uniquely identify this context
* with the GuC, protected by guc->contexts_lock
*/
u16 id;
/**
* @ref: the number of references to the guc_id, when
* transitioning in and out of zero protected by
* guc->contexts_lock
*/
atomic_t ref;
/**
* @link: in guc->guc_id_list when the guc_id has no refs but is
* still valid, protected by guc->contexts_lock
*/
struct list_head link;
} guc_id;
/* GuC scheduling state flags that do not require a lock. */
atomic_t guc_sched_state_no_lock;
/* GuC LRC descriptor ID */
u16 guc_id;
/* GuC LRC descriptor reference count */
atomic_t guc_id_ref;
/*
* GuC ID link - in list when unpinned but guc_id still valid in GuC
#ifdef CONFIG_DRM_I915_SELFTEST
/**
* @drop_schedule_enable: Force drop of schedule enable G2H for selftest
*/
struct list_head guc_id_link;
bool drop_schedule_enable;
/* GuC context blocked fence */
struct i915_sw_fence guc_blocked;
/*
* GuC priority management
/**
* @drop_schedule_disable: Force drop of schedule disable G2H for
* selftest
*/
u8 guc_prio;
u32 guc_prio_count[GUC_CLIENT_PRIORITY_NUM];
bool drop_schedule_disable;
/**
* @drop_deregister: Force drop of deregister G2H for selftest
*/
bool drop_deregister;
#endif
};
#endif /* __INTEL_CONTEXT_TYPES__ */
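As an illustration of the consolidated locking above, any walk of the per-context request list now takes the single guc_state.lock; compare intel_context_find_active_request() earlier in this diff (ce assumed in scope):

unsigned long flags;
struct i915_request *rq;

spin_lock_irqsave(&ce->guc_state.lock, flags);
list_for_each_entry(rq, &ce->guc_state.requests, sched.link) {
	/* Requests and sched_state may be inspected safely here. */
}
spin_unlock_irqrestore(&ce->guc_state.lock, flags);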


@ -175,6 +175,8 @@ intel_write_status_page(struct intel_engine_cs *engine, int reg, u32 value)
#define I915_GEM_HWS_SEQNO 0x40
#define I915_GEM_HWS_SEQNO_ADDR (I915_GEM_HWS_SEQNO * sizeof(u32))
#define I915_GEM_HWS_MIGRATE (0x42 * sizeof(u32))
#define I915_GEM_HWS_PXP 0x60
#define I915_GEM_HWS_PXP_ADDR (I915_GEM_HWS_PXP * sizeof(u32))
#define I915_GEM_HWS_SCRATCH 0x80
#define I915_HWS_CSB_BUF0_INDEX 0x10
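For reference, a worked example of the convention here, since these defines are dword indices into the hardware status page:

/*
 * I915_GEM_HWS_PXP_ADDR = 0x60 * sizeof(u32) = 0x180 bytes into the
 * status page: above the MIGRATE slot at 0x42 * sizeof(u32) = 0x108 and
 * below the scratch area at 0x80 * sizeof(u32) = 0x200.
 */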
@ -273,7 +275,7 @@ static inline bool intel_engine_uses_guc(const struct intel_engine_cs *engine)
static inline bool
intel_engine_has_preempt_reset(const struct intel_engine_cs *engine)
{
if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT)
return false;
return intel_engine_has_preemption(engine);
@ -300,7 +302,7 @@ intel_virtual_engine_has_heartbeat(const struct intel_engine_cs *engine)
static inline bool
intel_engine_has_heartbeat(const struct intel_engine_cs *engine)
{
if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
if (!CONFIG_DRM_I915_HEARTBEAT_INTERVAL)
return false;
if (intel_engine_is_virtual(engine))


@ -320,6 +320,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
BUILD_BUG_ON(BITS_PER_TYPE(engine->mask) < I915_NUM_ENGINES);
INIT_LIST_HEAD(&engine->pinned_contexts_list);
engine->id = id;
engine->legacy_idx = INVALID_ENGINE;
engine->mask = BIT(id);
@ -398,7 +399,8 @@ static void __setup_engine_capabilities(struct intel_engine_cs *engine)
engine->uabi_capabilities |=
I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC;
} else if (engine->class == VIDEO_ENHANCEMENT_CLASS) {
if (GRAPHICS_VER(i915) >= 9)
if (GRAPHICS_VER(i915) >= 9 &&
engine->gt->info.sfc_mask & BIT(engine->instance))
engine->uabi_capabilities |=
I915_VIDEO_AND_ENHANCE_CLASS_CAPABILITY_SFC;
}
@ -474,18 +476,25 @@ void intel_engines_free(struct intel_gt *gt)
}
static
bool gen11_vdbox_has_sfc(struct drm_i915_private *i915,
bool gen11_vdbox_has_sfc(struct intel_gt *gt,
unsigned int physical_vdbox,
unsigned int logical_vdbox, u16 vdbox_mask)
{
struct drm_i915_private *i915 = gt->i915;
/*
* In Gen11, only even numbered logical VDBOXes are hooked
* up to an SFC (Scaler & Format Converter) unit.
* In Gen12, even numbered physical instances are always connected
* to an SFC. Odd numbered physical instances have an SFC only if
* the previous even instance is fused off.
*
* Starting with Xe_HP, there's also a dedicated SFC_ENABLE field
* in the fuse register that tells us whether a specific SFC is present.
*/
if (GRAPHICS_VER(i915) == 12)
if ((gt->info.sfc_mask & BIT(physical_vdbox / 2)) == 0)
return false;
else if (GRAPHICS_VER(i915) == 12)
return (physical_vdbox % 2 == 0) ||
!(BIT(physical_vdbox - 1) & vdbox_mask);
else if (GRAPHICS_VER(i915) == 11)
@ -512,7 +521,7 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
struct intel_uncore *uncore = gt->uncore;
unsigned int logical_vdbox = 0;
unsigned int i;
u32 media_fuse;
u32 media_fuse, fuse1;
u16 vdbox_mask;
u16 vebox_mask;
@ -534,6 +543,13 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
vebox_mask = (media_fuse & GEN11_GT_VEBOX_DISABLE_MASK) >>
GEN11_GT_VEBOX_DISABLE_SHIFT;
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
fuse1 = intel_uncore_read(uncore, HSW_PAVP_FUSE1);
gt->info.sfc_mask = REG_FIELD_GET(XEHP_SFC_ENABLE_MASK, fuse1);
} else {
gt->info.sfc_mask = ~0;
}
for (i = 0; i < I915_MAX_VCS; i++) {
if (!HAS_ENGINE(gt, _VCS(i))) {
vdbox_mask &= ~BIT(i);
@ -546,7 +562,7 @@ static intel_engine_mask_t init_engine_mask(struct intel_gt *gt)
continue;
}
if (gen11_vdbox_has_sfc(i915, i, logical_vdbox, vdbox_mask))
if (gen11_vdbox_has_sfc(gt, i, logical_vdbox, vdbox_mask))
gt->info.vdbox_sfc_access |= BIT(i);
logical_vdbox++;
}
@ -875,6 +891,8 @@ intel_engine_create_pinned_context(struct intel_engine_cs *engine,
return ERR_PTR(err);
}
list_add_tail(&ce->pinned_contexts_link, &engine->pinned_contexts_list);
/*
* Give our perma-pinned kernel timelines a separate lockdep class,
* so that we can use them from within the normal user timelines
@ -897,6 +915,7 @@ void intel_engine_destroy_pinned_context(struct intel_context *ce)
list_del(&ce->timeline->engine_link);
mutex_unlock(&hwsp->vm->mutex);
list_del(&ce->pinned_contexts_link);
intel_context_unpin(ce);
intel_context_put(ce);
}
@ -1163,16 +1182,16 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
u32 mmio_base = engine->mmio_base;
int slice;
int subslice;
int iter;
memset(instdone, 0, sizeof(*instdone));
switch (GRAPHICS_VER(i915)) {
default:
if (GRAPHICS_VER(i915) >= 8) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
if (engine->id != RCS0)
break;
return;
instdone->slice_common =
intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@ -1182,21 +1201,39 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
instdone->slice_common_extra[1] =
intel_uncore_read(uncore, GEN12_SC_INSTDONE_EXTRA2);
}
for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
instdone->sampler[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_SAMPLER_INSTDONE);
instdone->row[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_ROW_INSTDONE);
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50)) {
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice) {
instdone->sampler[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_SAMPLER_INSTDONE);
instdone->row[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_ROW_INSTDONE);
}
} else {
for_each_instdone_slice_subslice(i915, sseu, slice, subslice) {
instdone->sampler[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_SAMPLER_INSTDONE);
instdone->row[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
GEN7_ROW_INSTDONE);
}
}
break;
case 7:
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 55)) {
for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
instdone->geom_svg[slice][subslice] =
read_subslice_reg(engine, slice, subslice,
XEHPG_INSTDONE_GEOM_SVG);
}
} else if (GRAPHICS_VER(i915) >= 7) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
if (engine->id != RCS0)
break;
return;
instdone->slice_common =
intel_uncore_read(uncore, GEN7_SC_INSTDONE);
@ -1204,22 +1241,15 @@ void intel_engine_get_instdone(const struct intel_engine_cs *engine,
intel_uncore_read(uncore, GEN7_SAMPLER_INSTDONE);
instdone->row[0][0] =
intel_uncore_read(uncore, GEN7_ROW_INSTDONE);
break;
case 6:
case 5:
case 4:
} else if (GRAPHICS_VER(i915) >= 4) {
instdone->instdone =
intel_uncore_read(uncore, RING_INSTDONE(mmio_base));
if (engine->id == RCS0)
/* HACK: Using the wrong struct member */
instdone->slice_common =
intel_uncore_read(uncore, GEN4_INSTDONE1);
break;
case 3:
case 2:
} else {
instdone->instdone = intel_uncore_read(uncore, GEN2_INSTDONE);
break;
}
}


@ -207,7 +207,7 @@ static void heartbeat(struct work_struct *wrk)
void intel_engine_unpark_heartbeat(struct intel_engine_cs *engine)
{
if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
if (!CONFIG_DRM_I915_HEARTBEAT_INTERVAL)
return;
next_heartbeat(engine);

View File

@ -298,6 +298,29 @@ void intel_engine_init__pm(struct intel_engine_cs *engine)
intel_engine_init_heartbeat(engine);
}
/**
* intel_engine_reset_pinned_contexts - Reset the pinned contexts of
* an engine.
* @engine: The engine whose pinned contexts we want to reset.
*
* Typically the pinned context LMEM images lose their content, or have it
* corrupted, on suspend. This function resets their images.
*/
void intel_engine_reset_pinned_contexts(struct intel_engine_cs *engine)
{
struct intel_context *ce;
list_for_each_entry(ce, &engine->pinned_contexts_list,
pinned_contexts_link) {
/* kernel context gets reset at __engine_unpark() */
if (ce == engine->kernel_context)
continue;
dbg_poison_ce(ce);
ce->ops->reset(ce);
}
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)
#include "selftest_engine_pm.c"
#endif


@ -69,4 +69,6 @@ intel_engine_create_kernel_request(struct intel_engine_cs *engine)
void intel_engine_init__pm(struct intel_engine_cs *engine);
void intel_engine_reset_pinned_contexts(struct intel_engine_cs *engine);
#endif /* INTEL_ENGINE_PM_H */


@ -67,8 +67,11 @@ struct intel_instdone {
/* The following exist only in the RCS engine */
u32 slice_common;
u32 slice_common_extra[2];
u32 sampler[I915_MAX_SLICES][I915_MAX_SUBSLICES];
u32 row[I915_MAX_SLICES][I915_MAX_SUBSLICES];
u32 sampler[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
u32 row[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
/* Added in XeHPG */
u32 geom_svg[GEN_MAX_GSLICES][I915_MAX_SUBSLICES];
};
/*
@ -304,6 +307,13 @@ struct intel_engine_cs {
struct intel_context *kernel_context; /* pinned */
/**
* pinned_contexts_list: List of pinned contexts. This list is
* assumed to be manipulated only during driver load or unload time,
* and therefore has no additional protection.
*/
struct list_head pinned_contexts_list;
intel_engine_mask_t saturated; /* submitting semaphores too late? */
struct {
@ -546,7 +556,7 @@ intel_engine_has_semaphores(const struct intel_engine_cs *engine)
static inline bool
intel_engine_has_timeslices(const struct intel_engine_cs *engine)
{
if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
if (!CONFIG_DRM_I915_TIMESLICE_DURATION)
return false;
return engine->flags & I915_ENGINE_HAS_TIMESLICES;
@ -578,4 +588,12 @@ intel_engine_has_relative_mmio(const struct intel_engine_cs * const engine)
for_each_if((instdone_has_slice(dev_priv_, sseu_, slice_)) && \
(instdone_has_subslice(dev_priv_, sseu_, slice_, \
subslice_)))
#define for_each_instdone_gslice_dss_xehp(dev_priv_, sseu_, iter_, gslice_, dss_) \
for ((iter_) = 0, (gslice_) = 0, (dss_) = 0; \
(iter_) < GEN_MAX_SUBSLICES; \
(iter_)++, (gslice_) = (iter_) / GEN_DSS_PER_GSLICE, \
(dss_) = (iter_) % GEN_DSS_PER_GSLICE) \
for_each_if(intel_sseu_has_subslice((sseu_), 0, (iter_)))
#endif /* __INTEL_ENGINE_TYPES_H__ */
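A minimal usage sketch of the new iterator, mirroring how intel_engine_get_instdone() drives it earlier in this diff (i915, sseu, engine and instdone assumed in scope):

int iter, slice, subslice;

for_each_instdone_gslice_dss_xehp(i915, sseu, iter, slice, subslice)
	instdone->sampler[slice][subslice] =
		read_subslice_reg(engine, slice, subslice,
				  GEN7_SAMPLER_INSTDONE);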


@ -2140,10 +2140,6 @@ static void __execlists_unhold(struct i915_request *rq)
if (p->flags & I915_DEPENDENCY_WEAK)
continue;
/* Propagate any change in error status */
if (rq->fence.error)
i915_request_set_error_once(w, rq->fence.error);
if (w->engine != rq->engine)
continue;
@ -2565,7 +2561,7 @@ __execlists_context_pre_pin(struct intel_context *ce,
if (!__test_and_set_bit(CONTEXT_INIT_BIT, &ce->flags)) {
lrc_init_state(ce, engine, *vaddr);
__i915_gem_object_flush_map(ce->state->obj, 0, engine->context_size);
}
return 0;
@ -2791,6 +2787,8 @@ static void execlists_sanitize(struct intel_engine_cs *engine)
/* And scrub the dirty cachelines for the HWSP */
clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
intel_engine_reset_pinned_contexts(engine);
}
static void enable_error_interrupt(struct intel_engine_cs *engine)
@ -3341,7 +3339,7 @@ logical_ring_default_vfuncs(struct intel_engine_cs *engine)
engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
if (can_preempt(engine)) {
engine->flags |= I915_ENGINE_HAS_PREEMPTION;
if (IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
if (CONFIG_DRM_I915_TIMESLICE_DURATION)
engine->flags |= I915_ENGINE_HAS_TIMESLICES;
}
}


@ -644,7 +644,7 @@ static int init_aliasing_ppgtt(struct i915_ggtt *ggtt)
struct i915_ppgtt *ppgtt;
int err;
ppgtt = i915_ppgtt_create(ggtt->vm.gt);
ppgtt = i915_ppgtt_create(ggtt->vm.gt, 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);
@ -727,7 +727,6 @@ static void ggtt_cleanup_hw(struct i915_ggtt *ggtt)
atomic_set(&ggtt->vm.open, 0);
rcu_barrier(); /* flush the RCU'ed__i915_vm_release */
flush_workqueue(ggtt->vm.i915->wq);
mutex_lock(&ggtt->vm.mutex);
@ -814,6 +813,21 @@ static unsigned int chv_get_total_gtt_size(u16 gmch_ctrl)
return 0;
}
static unsigned int gen6_gttmmadr_size(struct drm_i915_private *i915)
{
/*
* GEN6: GTTMMADR size is 4MB and GTTADR starts at 2MB offset
* GEN8: GTTMMADR size is 16MB and GTTADR starts at 8MB offset
*/
GEM_BUG_ON(GRAPHICS_VER(i915) < 6);
return (GRAPHICS_VER(i915) < 8) ? SZ_4M : SZ_16M;
}
static unsigned int gen6_gttadr_offset(struct drm_i915_private *i915)
{
return gen6_gttmmadr_size(i915) / 2;
}
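A worked example of the split these helpers encode (the BAR address below is purely hypothetical):

/*
 * On GEN8+: gen6_gttmmadr_size() = SZ_16M and gen6_gttadr_offset() = SZ_8M,
 * so with BAR0 at, say, 0xe0000000 the register space occupies
 * [0xe0000000, 0xe0800000) and the GTT PTE range begins at 0xe0800000.
 */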
static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
{
struct drm_i915_private *i915 = ggtt->vm.i915;
@ -822,8 +836,8 @@ static int ggtt_probe_common(struct i915_ggtt *ggtt, u64 size)
u32 pte_flags;
int ret;
/* For Modern GENs the PTEs and register space are split in the BAR */
phys_addr = pci_resource_start(pdev, 0) + pci_resource_len(pdev, 0) / 2;
GEM_WARN_ON(pci_resource_len(pdev, 0) != gen6_gttmmadr_size(i915));
phys_addr = pci_resource_start(pdev, 0) + gen6_gttadr_offset(i915);
/*
* On BXT+/ICL+ writes larger than 64 bit to the GTT pagetable range
@ -910,6 +924,7 @@ static int gen8_gmch_probe(struct i915_ggtt *ggtt)
size = gen8_get_total_gtt_size(snb_gmch_ctl);
ggtt->vm.alloc_pt_dma = alloc_pt_dma;
ggtt->vm.lmem_pt_obj_flags = I915_BO_ALLOC_PM_EARLY;
ggtt->vm.total = (size / sizeof(gen8_pte_t)) * I915_GTT_PAGE_SIZE;
ggtt->vm.cleanup = gen6_gmch_remove;


@ -28,10 +28,13 @@
#define INSTR_26_TO_24_MASK 0x7000000
#define INSTR_26_TO_24_SHIFT 24
#define __INSTR(client) ((client) << INSTR_CLIENT_SHIFT)
/*
* Memory interface instructions used by the kernel
*/
#define MI_INSTR(opcode, flags) (((opcode) << 23) | (flags))
#define MI_INSTR(opcode, flags) \
(__INSTR(INSTR_MI_CLIENT) | (opcode) << 23 | (flags))
/* Many MI commands use bit 22 of the header dword for GGTT vs PPGTT */
#define MI_GLOBAL_GTT (1<<22)
@ -57,6 +60,7 @@
#define MI_SUSPEND_FLUSH MI_INSTR(0x0b, 0)
#define MI_SUSPEND_FLUSH_EN (1<<0)
#define MI_SET_APPID MI_INSTR(0x0e, 0)
#define MI_SET_APPID_SESSION_ID(x) ((x) << 0)
#define MI_OVERLAY_FLIP MI_INSTR(0x11, 0)
#define MI_OVERLAY_CONTINUE (0x0<<21)
#define MI_OVERLAY_ON (0x1<<21)
@ -146,6 +150,7 @@
#define MI_STORE_REGISTER_MEM_GEN8 MI_INSTR(0x24, 2)
#define MI_SRM_LRM_GLOBAL_GTT (1<<22)
#define MI_FLUSH_DW MI_INSTR(0x26, 1) /* for GEN6 */
#define MI_FLUSH_DW_PROTECTED_MEM_EN (1 << 22)
#define MI_FLUSH_DW_STORE_INDEX (1<<21)
#define MI_INVALIDATE_TLB (1<<18)
#define MI_FLUSH_DW_OP_STOREDW (1<<14)
@ -272,6 +277,19 @@
#define MI_MATH_REG_ZF 0x32
#define MI_MATH_REG_CF 0x33
/*
* Media instructions used by the kernel
*/
#define MEDIA_INSTR(pipe, op, sub_op, flags) \
(__INSTR(INSTR_RC_CLIENT) | (pipe) << INSTR_SUBCLIENT_SHIFT | \
(op) << INSTR_26_TO_24_SHIFT | (sub_op) << 16 | (flags))
#define MFX_WAIT MEDIA_INSTR(1, 0, 0, 0)
#define MFX_WAIT_DW0_MFX_SYNC_CONTROL_FLAG REG_BIT(8)
#define MFX_WAIT_DW0_PXP_SYNC_CONTROL_FLAG REG_BIT(9)
#define CRYPTO_KEY_EXCHANGE MEDIA_INSTR(2, 6, 9, 0)
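A hedged sketch of how the new MFX_WAIT sync-control flags are meant to be used: they are OR'ed straight into the instruction header dword. The batch cursor below is an assumption for illustration:

u32 *cs = batch; /* hypothetical cursor into a batch buffer */

/* Wait for both MFX and PXP sync before continuing. */
*cs++ = MFX_WAIT |
	MFX_WAIT_DW0_MFX_SYNC_CONTROL_FLAG |
	MFX_WAIT_DW0_PXP_SYNC_CONTROL_FLAG;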
/*
* Commands used only by the command parser
*/
@ -328,8 +346,6 @@
#define GFX_OP_3DSTATE_BINDING_TABLE_EDIT_PS \
((0x3<<29)|(0x3<<27)|(0x0<<24)|(0x47<<16))
#define MFX_WAIT ((0x3<<29)|(0x1<<27)|(0x0<<16))
#define COLOR_BLT ((0x2<<29)|(0x40<<22))
#define SRC_COPY_BLT ((0x2<<29)|(0x43<<22))


@ -3,7 +3,7 @@
* Copyright © 2019 Intel Corporation
*/
#include "debugfs_gt.h"
#include "intel_gt_debugfs.h"
#include "gem/i915_gem_lmem.h"
#include "i915_drv.h"
@ -15,12 +15,13 @@
#include "intel_gt_requests.h"
#include "intel_migrate.h"
#include "intel_mocs.h"
#include "intel_pm.h"
#include "intel_rc6.h"
#include "intel_renderstate.h"
#include "intel_rps.h"
#include "intel_uncore.h"
#include "intel_pm.h"
#include "shmem_utils.h"
#include "pxp/intel_pxp.h"
void intel_gt_init_early(struct intel_gt *gt, struct drm_i915_private *i915)
{
@ -434,7 +435,7 @@ void intel_gt_driver_register(struct intel_gt *gt)
{
intel_rps_driver_register(&gt->rps);
debugfs_gt_register(gt);
intel_gt_debugfs_register(gt);
}
static int intel_gt_init_scratch(struct intel_gt *gt, unsigned int size)
@ -481,7 +482,7 @@ static void intel_gt_fini_scratch(struct intel_gt *gt)
static struct i915_address_space *kernel_vm(struct intel_gt *gt)
{
if (INTEL_PPGTT(gt->i915) > INTEL_PPGTT_ALIASING)
return &i915_ppgtt_create(gt)->vm;
return &i915_ppgtt_create(gt, I915_BO_ALLOC_PM_EARLY)->vm;
else
return i915_vm_get(&gt->ggtt->vm);
}
@ -660,6 +661,8 @@ int intel_gt_init(struct intel_gt *gt)
if (err)
return err;
intel_gt_init_workarounds(gt);
/*
* This is just a security blanket to placate dragons.
* On some systems, we very sporadically observe that the first TLBs
@ -682,6 +685,8 @@ int intel_gt_init(struct intel_gt *gt)
goto err_pm;
}
intel_set_mocs_index(gt);
err = intel_engines_init(gt);
if (err)
goto err_engines;
@ -710,6 +715,8 @@ int intel_gt_init(struct intel_gt *gt)
intel_migrate_init(&gt->migrate, gt);
intel_pxp_init(&gt->pxp);
goto out_fw;
err_gt:
__intel_gt_disable(gt);
@ -737,6 +744,8 @@ void intel_gt_driver_remove(struct intel_gt *gt)
intel_uc_driver_remove(&gt->uc);
intel_engines_release(gt);
intel_gt_flush_buffer_pool(gt);
}
void intel_gt_driver_unregister(struct intel_gt *gt)
@ -745,12 +754,14 @@ void intel_gt_driver_unregister(struct intel_gt *gt)
intel_rps_driver_unregister(&gt->rps);
intel_pxp_fini(&gt->pxp);
/*
* Upon unregistering the device to prevent any new users, cancel
* all in-flight requests so that we can quickly unbind the active
* resources.
*/
intel_gt_set_wedged(gt);
intel_gt_set_wedged_on_fini(gt);
/* Scrub all HW state upon release */
with_intel_runtime_pm(gt->uncore->rpm, wakeref)
@ -765,6 +776,7 @@ void intel_gt_driver_release(struct intel_gt *gt)
if (vm) /* FIXME being called twice on error paths :( */
i915_vm_put(vm);
intel_wa_list_free(&gt->wa_list);
intel_gt_pm_fini(gt);
intel_gt_fini_scratch(gt);
intel_gt_fini_buffer_pool(gt);


@ -245,8 +245,6 @@ void intel_gt_fini_buffer_pool(struct intel_gt *gt)
struct intel_gt_buffer_pool *pool = &gt->buffer_pool;
int n;
intel_gt_flush_buffer_pool(gt);
for (n = 0; n < ARRAY_SIZE(pool->cache_list); n++)
GEM_BUG_ON(!list_empty(&pool->cache_list[n]));
}


@ -5,14 +5,15 @@
#include <linux/debugfs.h>
#include "debugfs_engines.h"
#include "debugfs_gt.h"
#include "debugfs_gt_pm.h"
#include "intel_sseu_debugfs.h"
#include "uc/intel_uc_debugfs.h"
#include "i915_drv.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_engines_debugfs.h"
#include "intel_gt_pm_debugfs.h"
#include "intel_sseu_debugfs.h"
#include "pxp/intel_pxp_debugfs.h"
#include "uc/intel_uc_debugfs.h"
void debugfs_gt_register(struct intel_gt *gt)
void intel_gt_debugfs_register(struct intel_gt *gt)
{
struct dentry *root;
@ -23,15 +24,16 @@ void debugfs_gt_register(struct intel_gt *gt)
if (IS_ERR(root))
return;
debugfs_engines_register(gt, root);
debugfs_gt_pm_register(gt, root);
intel_gt_engines_debugfs_register(gt, root);
intel_gt_pm_debugfs_register(gt, root);
intel_sseu_debugfs_register(gt, root);
intel_uc_debugfs_register(&gt->uc, root);
intel_pxp_debugfs_register(&gt->pxp, root);
}
void intel_gt_debugfs_register_files(struct dentry *root,
const struct debugfs_gt_file *files,
const struct intel_gt_debugfs_file *files,
unsigned long count, void *data)
{
while (count--) {


@ -3,14 +3,14 @@
* Copyright © 2019 Intel Corporation
*/
#ifndef DEBUGFS_GT_H
#define DEBUGFS_GT_H
#ifndef INTEL_GT_DEBUGFS_H
#define INTEL_GT_DEBUGFS_H
#include <linux/file.h>
struct intel_gt;
#define DEFINE_GT_DEBUGFS_ATTRIBUTE(__name) \
#define DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(__name) \
static int __name ## _open(struct inode *inode, struct file *file) \
{ \
return single_open(file, __name ## _show, inode->i_private); \
@ -23,16 +23,16 @@ static const struct file_operations __name ## _fops = { \
.release = single_release, \
}
void debugfs_gt_register(struct intel_gt *gt);
void intel_gt_debugfs_register(struct intel_gt *gt);
struct debugfs_gt_file {
struct intel_gt_debugfs_file {
const char *name;
const struct file_operations *fops;
bool (*eval)(void *data);
};
void intel_gt_debugfs_register_files(struct dentry *root,
const struct debugfs_gt_file *files,
const struct intel_gt_debugfs_file *files,
unsigned long count, void *data);
#endif /* DEBUGFS_GT_H */
#endif /* INTEL_GT_DEBUGFS_H */
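For illustration, a hypothetical debugfs file written against the renamed interface; foo_show and its registration below are assumptions, not part of this series:

static int foo_show(struct seq_file *m, void *data)
{
	struct intel_gt *gt = m->private;

	seq_printf(m, "awake? %s\n", yesno(gt->awake));
	return 0;
}
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(foo);

static void foo_debugfs_register(struct intel_gt *gt, struct dentry *root)
{
	static const struct intel_gt_debugfs_file files[] = {
		{ "foo", &foo_fops },
	};

	intel_gt_debugfs_register_files(root, files, ARRAY_SIZE(files), gt);
}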


@ -6,10 +6,10 @@
#include <drm/drm_print.h>
#include "debugfs_engines.h"
#include "debugfs_gt.h"
#include "i915_drv.h" /* for_each_engine! */
#include "intel_engine.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_engines_debugfs.h"
static int engines_show(struct seq_file *m, void *data)
{
@ -24,11 +24,11 @@ static int engines_show(struct seq_file *m, void *data)
return 0;
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(engines);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(engines);
void debugfs_engines_register(struct intel_gt *gt, struct dentry *root)
void intel_gt_engines_debugfs_register(struct intel_gt *gt, struct dentry *root)
{
static const struct debugfs_gt_file files[] = {
static const struct intel_gt_debugfs_file files[] = {
{ "engines", &engines_fops },
};


@ -0,0 +1,14 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2019 Intel Corporation
*/
#ifndef INTEL_GT_ENGINES_DEBUGFS_H
#define INTEL_GT_ENGINES_DEBUGFS_H
struct intel_gt;
struct dentry;
void intel_gt_engines_debugfs_register(struct intel_gt *gt, struct dentry *root);
#endif /* INTEL_GT_ENGINES_DEBUGFS_H */


@ -13,6 +13,7 @@
#include "intel_lrc_reg.h"
#include "intel_uncore.h"
#include "intel_rps.h"
#include "pxp/intel_pxp_irq.h"
static void guc_irq_handler(struct intel_guc *guc, u16 iir)
{
@ -64,6 +65,9 @@ gen11_other_irq_handler(struct intel_gt *gt, const u8 instance,
if (instance == OTHER_GTPM_INSTANCE)
return gen11_rps_irq_handler(&gt->rps, iir);
if (instance == OTHER_KCR_INSTANCE)
return intel_pxp_irq_handler(&gt->pxp, iir);
WARN_ONCE(1, "unhandled other interrupt instance=0x%x, iir=0x%x\n",
instance, iir);
}
@ -196,6 +200,9 @@ void gen11_gt_irq_reset(struct intel_gt *gt)
intel_uncore_write(uncore, GEN11_GPM_WGBOXPERF_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_GUC_SG_INTR_ENABLE, 0);
intel_uncore_write(uncore, GEN11_GUC_SG_INTR_MASK, ~0);
intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_ENABLE, 0);
intel_uncore_write(uncore, GEN11_CRYPTO_RSVD_INTR_MASK, ~0);
}
void gen11_gt_irq_postinstall(struct intel_gt *gt)


@ -18,6 +18,9 @@
#include "intel_rc6.h"
#include "intel_rps.h"
#include "intel_wakeref.h"
#include "pxp/intel_pxp_pm.h"
#define I915_GT_SUSPEND_IDLE_TIMEOUT (HZ / 2)
static void user_forcewake(struct intel_gt *gt, bool suspend)
{
@ -262,6 +265,8 @@ int intel_gt_resume(struct intel_gt *gt)
intel_uc_resume(&gt->uc);
intel_pxp_resume(&gt->pxp);
user_forcewake(gt, false);
out_fw:
@ -279,7 +284,7 @@ static void wait_for_suspend(struct intel_gt *gt)
if (!intel_gt_pm_is_awake(gt))
return;
if (intel_gt_wait_for_idle(gt, I915_GEM_IDLE_TIMEOUT) == -ETIME) {
if (intel_gt_wait_for_idle(gt, I915_GT_SUSPEND_IDLE_TIMEOUT) == -ETIME) {
/*
* Forcibly cancel outstanding work and leave
* the gpu quiet.
@ -296,7 +301,7 @@ void intel_gt_suspend_prepare(struct intel_gt *gt)
user_forcewake(gt, true);
wait_for_suspend(gt);
intel_uc_suspend(&gt->uc);
intel_pxp_suspend(&gt->pxp, false);
}
static suspend_state_t pm_suspend_target(void)
@ -320,6 +325,8 @@ void intel_gt_suspend_late(struct intel_gt *gt)
GEM_BUG_ON(gt->awake);
intel_uc_suspend(&gt->uc);
/*
* On disabling the device, we want to turn off HW access to memory
* that we no longer own.
@ -346,6 +353,7 @@ void intel_gt_suspend_late(struct intel_gt *gt)
void intel_gt_runtime_suspend(struct intel_gt *gt)
{
intel_pxp_suspend(&gt->pxp, true);
intel_uc_runtime_suspend(&gt->uc);
GT_TRACE(gt, "\n");
@ -353,11 +361,19 @@ void intel_gt_runtime_suspend(struct intel_gt *gt)
int intel_gt_runtime_resume(struct intel_gt *gt)
{
int ret;
GT_TRACE(gt, "\n");
intel_gt_init_swizzling(gt);
intel_ggtt_restore_fences(gt->ggtt);
return intel_uc_runtime_resume(&gt->uc);
ret = intel_uc_runtime_resume(&gt->uc);
if (ret)
return ret;
intel_pxp_resume(&gt->pxp);
return 0;
}
static ktime_t __intel_gt_get_awake_time(const struct intel_gt *gt)


@ -6,12 +6,12 @@
#include <linux/seq_file.h>
#include "debugfs_gt.h"
#include "debugfs_gt_pm.h"
#include "i915_drv.h"
#include "intel_gt.h"
#include "intel_gt_clock_utils.h"
#include "intel_gt_debugfs.h"
#include "intel_gt_pm.h"
#include "intel_gt_pm_debugfs.h"
#include "intel_llc.h"
#include "intel_rc6.h"
#include "intel_rps.h"
@ -36,7 +36,7 @@ static int fw_domains_show(struct seq_file *m, void *data)
return 0;
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(fw_domains);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(fw_domains);
static void print_rc6_res(struct seq_file *m,
const char *title,
@ -238,11 +238,10 @@ static int drpc_show(struct seq_file *m, void *unused)
return err;
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(drpc);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(drpc);
static int frequency_show(struct seq_file *m, void *unused)
void intel_gt_pm_frequency_dump(struct intel_gt *gt, struct drm_printer *p)
{
struct intel_gt *gt = m->private;
struct drm_i915_private *i915 = gt->i915;
struct intel_uncore *uncore = gt->uncore;
struct intel_rps *rps = &gt->rps;
@ -254,21 +253,21 @@ static int frequency_show(struct seq_file *m, void *unused)
u16 rgvswctl = intel_uncore_read16(uncore, MEMSWCTL);
u16 rgvstat = intel_uncore_read16(uncore, MEMSTAT_ILK);
seq_printf(m, "Requested P-state: %d\n", (rgvswctl >> 8) & 0xf);
seq_printf(m, "Requested VID: %d\n", rgvswctl & 0x3f);
seq_printf(m, "Current VID: %d\n", (rgvstat & MEMSTAT_VID_MASK) >>
drm_printf(p, "Requested P-state: %d\n", (rgvswctl >> 8) & 0xf);
drm_printf(p, "Requested VID: %d\n", rgvswctl & 0x3f);
drm_printf(p, "Current VID: %d\n", (rgvstat & MEMSTAT_VID_MASK) >>
MEMSTAT_VID_SHIFT);
seq_printf(m, "Current P-state: %d\n",
drm_printf(p, "Current P-state: %d\n",
(rgvstat & MEMSTAT_PSTATE_MASK) >> MEMSTAT_PSTATE_SHIFT);
} else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) {
u32 rpmodectl, freq_sts;
rpmodectl = intel_uncore_read(uncore, GEN6_RP_CONTROL);
seq_printf(m, "Video Turbo Mode: %s\n",
drm_printf(p, "Video Turbo Mode: %s\n",
yesno(rpmodectl & GEN6_RP_MEDIA_TURBO));
seq_printf(m, "HW control enabled: %s\n",
drm_printf(p, "HW control enabled: %s\n",
yesno(rpmodectl & GEN6_RP_ENABLE));
seq_printf(m, "SW control enabled: %s\n",
drm_printf(p, "SW control enabled: %s\n",
yesno((rpmodectl & GEN6_RP_MEDIA_MODE_MASK) ==
GEN6_RP_MEDIA_SW_MODE));
@ -276,25 +275,25 @@ static int frequency_show(struct seq_file *m, void *unused)
freq_sts = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS);
vlv_punit_put(i915);
seq_printf(m, "PUNIT_REG_GPU_FREQ_STS: 0x%08x\n", freq_sts);
seq_printf(m, "DDR freq: %d MHz\n", i915->mem_freq);
drm_printf(p, "PUNIT_REG_GPU_FREQ_STS: 0x%08x\n", freq_sts);
drm_printf(p, "DDR freq: %d MHz\n", i915->mem_freq);
seq_printf(m, "actual GPU freq: %d MHz\n",
drm_printf(p, "actual GPU freq: %d MHz\n",
intel_gpu_freq(rps, (freq_sts >> 8) & 0xff));
seq_printf(m, "current GPU freq: %d MHz\n",
drm_printf(p, "current GPU freq: %d MHz\n",
intel_gpu_freq(rps, rps->cur_freq));
seq_printf(m, "max GPU freq: %d MHz\n",
drm_printf(p, "max GPU freq: %d MHz\n",
intel_gpu_freq(rps, rps->max_freq));
seq_printf(m, "min GPU freq: %d MHz\n",
drm_printf(p, "min GPU freq: %d MHz\n",
intel_gpu_freq(rps, rps->min_freq));
seq_printf(m, "idle GPU freq: %d MHz\n",
drm_printf(p, "idle GPU freq: %d MHz\n",
intel_gpu_freq(rps, rps->idle_freq));
seq_printf(m, "efficient (RPe) frequency: %d MHz\n",
drm_printf(p, "efficient (RPe) frequency: %d MHz\n",
intel_gpu_freq(rps, rps->efficient_freq));
} else if (GRAPHICS_VER(i915) >= 6) {
u32 rp_state_limits;
@ -309,13 +308,11 @@ static int frequency_show(struct seq_file *m, void *unused)
int max_freq;
rp_state_limits = intel_uncore_read(uncore, GEN6_RP_STATE_LIMITS);
if (IS_GEN9_LP(i915)) {
rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP);
rp_state_cap = intel_rps_read_state_cap(rps);
if (IS_GEN9_LP(i915))
gt_perf_status = intel_uncore_read(uncore, BXT_GT_PERF_STATUS);
} else {
rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
else
gt_perf_status = intel_uncore_read(uncore, GEN6_GT_PERF_STATUS);
}
/* RPSTAT1 is in the GT power well */
intel_uncore_forcewake_get(uncore, FORCEWAKE_ALL);
@ -376,113 +373,121 @@ static int frequency_show(struct seq_file *m, void *unused)
}
pm_mask = intel_uncore_read(uncore, GEN6_PMINTRMSK);
seq_printf(m, "Video Turbo Mode: %s\n",
drm_printf(p, "Video Turbo Mode: %s\n",
yesno(rpmodectl & GEN6_RP_MEDIA_TURBO));
seq_printf(m, "HW control enabled: %s\n",
drm_printf(p, "HW control enabled: %s\n",
yesno(rpmodectl & GEN6_RP_ENABLE));
seq_printf(m, "SW control enabled: %s\n",
drm_printf(p, "SW control enabled: %s\n",
yesno((rpmodectl & GEN6_RP_MEDIA_MODE_MASK) ==
GEN6_RP_MEDIA_SW_MODE));
seq_printf(m, "PM IER=0x%08x IMR=0x%08x, MASK=0x%08x\n",
drm_printf(p, "PM IER=0x%08x IMR=0x%08x, MASK=0x%08x\n",
pm_ier, pm_imr, pm_mask);
if (GRAPHICS_VER(i915) <= 10)
seq_printf(m, "PM ISR=0x%08x IIR=0x%08x\n",
drm_printf(p, "PM ISR=0x%08x IIR=0x%08x\n",
pm_isr, pm_iir);
seq_printf(m, "pm_intrmsk_mbz: 0x%08x\n",
drm_printf(p, "pm_intrmsk_mbz: 0x%08x\n",
rps->pm_intrmsk_mbz);
seq_printf(m, "GT_PERF_STATUS: 0x%08x\n", gt_perf_status);
seq_printf(m, "Render p-state ratio: %d\n",
drm_printf(p, "GT_PERF_STATUS: 0x%08x\n", gt_perf_status);
drm_printf(p, "Render p-state ratio: %d\n",
(gt_perf_status & (GRAPHICS_VER(i915) >= 9 ? 0x1ff00 : 0xff00)) >> 8);
seq_printf(m, "Render p-state VID: %d\n",
drm_printf(p, "Render p-state VID: %d\n",
gt_perf_status & 0xff);
seq_printf(m, "Render p-state limit: %d\n",
drm_printf(p, "Render p-state limit: %d\n",
rp_state_limits & 0xff);
seq_printf(m, "RPSTAT1: 0x%08x\n", rpstat);
seq_printf(m, "RPMODECTL: 0x%08x\n", rpmodectl);
seq_printf(m, "RPINCLIMIT: 0x%08x\n", rpinclimit);
seq_printf(m, "RPDECLIMIT: 0x%08x\n", rpdeclimit);
seq_printf(m, "RPNSWREQ: %dMHz\n", reqf);
seq_printf(m, "CAGF: %dMHz\n", cagf);
seq_printf(m, "RP CUR UP EI: %d (%lldns)\n",
drm_printf(p, "RPSTAT1: 0x%08x\n", rpstat);
drm_printf(p, "RPMODECTL: 0x%08x\n", rpmodectl);
drm_printf(p, "RPINCLIMIT: 0x%08x\n", rpinclimit);
drm_printf(p, "RPDECLIMIT: 0x%08x\n", rpdeclimit);
drm_printf(p, "RPNSWREQ: %dMHz\n", reqf);
drm_printf(p, "CAGF: %dMHz\n", cagf);
drm_printf(p, "RP CUR UP EI: %d (%lldns)\n",
rpcurupei,
intel_gt_pm_interval_to_ns(gt, rpcurupei));
seq_printf(m, "RP CUR UP: %d (%lldns)\n",
drm_printf(p, "RP CUR UP: %d (%lldns)\n",
rpcurup, intel_gt_pm_interval_to_ns(gt, rpcurup));
seq_printf(m, "RP PREV UP: %d (%lldns)\n",
drm_printf(p, "RP PREV UP: %d (%lldns)\n",
rpprevup, intel_gt_pm_interval_to_ns(gt, rpprevup));
seq_printf(m, "Up threshold: %d%%\n",
drm_printf(p, "Up threshold: %d%%\n",
rps->power.up_threshold);
seq_printf(m, "RP UP EI: %d (%lldns)\n",
drm_printf(p, "RP UP EI: %d (%lldns)\n",
rpupei, intel_gt_pm_interval_to_ns(gt, rpupei));
seq_printf(m, "RP UP THRESHOLD: %d (%lldns)\n",
drm_printf(p, "RP UP THRESHOLD: %d (%lldns)\n",
rpupt, intel_gt_pm_interval_to_ns(gt, rpupt));
seq_printf(m, "RP CUR DOWN EI: %d (%lldns)\n",
drm_printf(p, "RP CUR DOWN EI: %d (%lldns)\n",
rpcurdownei,
intel_gt_pm_interval_to_ns(gt, rpcurdownei));
seq_printf(m, "RP CUR DOWN: %d (%lldns)\n",
drm_printf(p, "RP CUR DOWN: %d (%lldns)\n",
rpcurdown,
intel_gt_pm_interval_to_ns(gt, rpcurdown));
seq_printf(m, "RP PREV DOWN: %d (%lldns)\n",
drm_printf(p, "RP PREV DOWN: %d (%lldns)\n",
rpprevdown,
intel_gt_pm_interval_to_ns(gt, rpprevdown));
seq_printf(m, "Down threshold: %d%%\n",
drm_printf(p, "Down threshold: %d%%\n",
rps->power.down_threshold);
seq_printf(m, "RP DOWN EI: %d (%lldns)\n",
drm_printf(p, "RP DOWN EI: %d (%lldns)\n",
rpdownei, intel_gt_pm_interval_to_ns(gt, rpdownei));
seq_printf(m, "RP DOWN THRESHOLD: %d (%lldns)\n",
drm_printf(p, "RP DOWN THRESHOLD: %d (%lldns)\n",
rpdownt, intel_gt_pm_interval_to_ns(gt, rpdownt));
max_freq = (IS_GEN9_LP(i915) ? rp_state_cap >> 0 :
rp_state_cap >> 16) & 0xff;
max_freq *= (IS_GEN9_BC(i915) ||
GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Lowest (RPN) frequency: %dMHz\n",
drm_printf(p, "Lowest (RPN) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
max_freq = (rp_state_cap & 0xff00) >> 8;
max_freq *= (IS_GEN9_BC(i915) ||
GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Nominal (RP1) frequency: %dMHz\n",
drm_printf(p, "Nominal (RP1) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
max_freq = (IS_GEN9_LP(i915) ? rp_state_cap >> 16 :
rp_state_cap >> 0) & 0xff;
max_freq *= (IS_GEN9_BC(i915) ||
GRAPHICS_VER(i915) >= 11 ? GEN9_FREQ_SCALER : 1);
seq_printf(m, "Max non-overclocked (RP0) frequency: %dMHz\n",
drm_printf(p, "Max non-overclocked (RP0) frequency: %dMHz\n",
intel_gpu_freq(rps, max_freq));
seq_printf(m, "Max overclocked frequency: %dMHz\n",
drm_printf(p, "Max overclocked frequency: %dMHz\n",
intel_gpu_freq(rps, rps->max_freq));
seq_printf(m, "Current freq: %d MHz\n",
drm_printf(p, "Current freq: %d MHz\n",
intel_gpu_freq(rps, rps->cur_freq));
seq_printf(m, "Actual freq: %d MHz\n", cagf);
seq_printf(m, "Idle freq: %d MHz\n",
drm_printf(p, "Actual freq: %d MHz\n", cagf);
drm_printf(p, "Idle freq: %d MHz\n",
intel_gpu_freq(rps, rps->idle_freq));
seq_printf(m, "Min freq: %d MHz\n",
drm_printf(p, "Min freq: %d MHz\n",
intel_gpu_freq(rps, rps->min_freq));
seq_printf(m, "Boost freq: %d MHz\n",
drm_printf(p, "Boost freq: %d MHz\n",
intel_gpu_freq(rps, rps->boost_freq));
seq_printf(m, "Max freq: %d MHz\n",
drm_printf(p, "Max freq: %d MHz\n",
intel_gpu_freq(rps, rps->max_freq));
seq_printf(m,
drm_printf(p,
"efficient (RPe) frequency: %d MHz\n",
intel_gpu_freq(rps, rps->efficient_freq));
} else {
seq_puts(m, "no P-state info available\n");
drm_puts(p, "no P-state info available\n");
}
seq_printf(m, "Current CD clock frequency: %d kHz\n", i915->cdclk.hw.cdclk);
seq_printf(m, "Max CD clock frequency: %d kHz\n", i915->max_cdclk_freq);
seq_printf(m, "Max pixel clock frequency: %d kHz\n", i915->max_dotclk_freq);
drm_printf(p, "Current CD clock frequency: %d kHz\n", i915->cdclk.hw.cdclk);
drm_printf(p, "Max CD clock frequency: %d kHz\n", i915->max_cdclk_freq);
drm_printf(p, "Max pixel clock frequency: %d kHz\n", i915->max_dotclk_freq);
intel_runtime_pm_put(uncore->rpm, wakeref);
}
static int frequency_show(struct seq_file *m, void *unused)
{
struct intel_gt *gt = m->private;
struct drm_printer p = drm_seq_file_printer(m);
intel_gt_pm_frequency_dump(gt, &p);
return 0;
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(frequency);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(frequency);
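
Pulling the dump out of the seq_file callback means any code that can construct a drm_printer can reuse it, not just debugfs. A minimal sketch of such a caller, assuming only drm_info_printer() from drm_print.h (the helper name below is illustrative, not part of this series):

#include <drm/drm_print.h>
#include "intel_gt_pm_debugfs.h"

/* Hypothetical helper: route the same frequency dump to the kernel log. */
static void gt_pm_dump_frequencies_to_log(struct intel_gt *gt)
{
	struct drm_printer p = drm_info_printer(gt->i915->drm.dev);

	intel_gt_pm_frequency_dump(gt, &p);
}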
static int llc_show(struct seq_file *m, void *data)
{
@ -535,7 +540,7 @@ static bool llc_eval(void *data)
return HAS_LLC(gt->i915);
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(llc);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(llc);
static const char *rps_power_to_str(unsigned int power)
{
@ -614,11 +619,11 @@ static bool rps_eval(void *data)
return HAS_RPS(gt->i915);
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(rps_boost);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rps_boost);
void debugfs_gt_pm_register(struct intel_gt *gt, struct dentry *root)
void intel_gt_pm_debugfs_register(struct intel_gt *gt, struct dentry *root)
{
static const struct debugfs_gt_file files[] = {
static const struct intel_gt_debugfs_file files[] = {
{ "drpc", &drpc_fops, NULL },
{ "frequency", &frequency_fops, NULL },
{ "forcewake", &fw_domains_fops, NULL },

View File

@ -0,0 +1,16 @@
/* SPDX-License-Identifier: MIT */
/*
* Copyright © 2019 Intel Corporation
*/
#ifndef INTEL_GT_PM_DEBUGFS_H
#define INTEL_GT_PM_DEBUGFS_H
struct intel_gt;
struct dentry;
struct drm_printer;
void intel_gt_pm_debugfs_register(struct intel_gt *gt, struct dentry *root);
void intel_gt_pm_frequency_dump(struct intel_gt *gt, struct drm_printer *m);
#endif /* INTEL_GT_PM_DEBUGFS_H */

View File

@ -26,6 +26,7 @@
#include "intel_rps_types.h"
#include "intel_migrate_types.h"
#include "intel_wakeref.h"
#include "pxp/intel_pxp_types.h"
struct drm_i915_private;
struct i915_ggtt;
@ -72,6 +73,8 @@ struct intel_gt {
struct intel_uc uc;
struct i915_wa_list wa_list;
struct intel_gt_timelines {
spinlock_t lock; /* protects active_list */
struct list_head active_list;
@ -184,6 +187,9 @@ struct intel_gt {
u8 num_engines;
/* General presence of SFC units */
u8 sfc_mask;
/* Media engine access to SFC per instance */
u8 vdbox_sfc_access;
@ -192,6 +198,12 @@ struct intel_gt {
unsigned long mslice_mask;
} info;
struct {
u8 uc_index;
} mocs;
struct intel_pxp pxp;
};
enum intel_gt_scratch_field {

View File

@ -28,7 +28,8 @@ struct drm_i915_gem_object *alloc_pt_lmem(struct i915_address_space *vm, int sz)
* used the passed in size for the page size, which should ensure it
* also has the same alignment.
*/
obj = __i915_gem_object_create_lmem_with_ps(vm->i915, sz, sz, 0);
obj = __i915_gem_object_create_lmem_with_ps(vm->i915, sz, sz,
vm->lmem_pt_obj_flags);
/*
* Ensure all paging structures for this vm share the same dma-resv
* object underneath, with the idea that one object_lock() will lock
@ -155,7 +156,7 @@ void i915_vm_resv_release(struct kref *kref)
static void __i915_vm_release(struct work_struct *work)
{
struct i915_address_space *vm =
container_of(work, struct i915_address_space, rcu.work);
container_of(work, struct i915_address_space, release_work);
vm->cleanup(vm);
i915_address_space_fini(vm);
@ -171,7 +172,7 @@ void i915_vm_release(struct kref *kref)
GEM_BUG_ON(i915_is_ggtt(vm));
trace_i915_ppgtt_release(vm);
queue_rcu_work(vm->i915->wq, &vm->rcu);
queue_work(vm->i915->wq, &vm->release_work);
}
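
The switch from rcu_work to a plain work_struct drops the RCU grace period that used to sit between i915_vm_release() and __i915_vm_release(); the vm is now torn down as soon as the workqueue runs the item. The pattern in isolation, as a minimal kernel-style sketch (struct and function names simplified):

#include <linux/workqueue.h>

struct vm {
	struct work_struct release_work;
};

static void vm_release_worker(struct work_struct *work)
{
	struct vm *vm = container_of(work, struct vm, release_work);

	/* tear down page tables here, then free vm */
}

static void vm_init(struct vm *vm)
{
	INIT_WORK(&vm->release_work, vm_release_worker);
}

static void vm_release(struct vm *vm)
{
	/* runs vm_release_worker() with no RCU grace period in between */
	queue_work(system_wq, &vm->release_work);
}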
void i915_address_space_init(struct i915_address_space *vm, int subclass)
@ -185,7 +186,7 @@ void i915_address_space_init(struct i915_address_space *vm, int subclass)
if (!kref_read(&vm->resv_ref))
kref_init(&vm->resv_ref);
INIT_RCU_WORK(&vm->rcu, __i915_vm_release);
INIT_WORK(&vm->release_work, __i915_vm_release);
atomic_set(&vm->open, 1);
/*

View File

@ -213,7 +213,7 @@ struct i915_vma_ops {
struct i915_address_space {
struct kref ref;
struct rcu_work rcu;
struct work_struct release_work;
struct drm_mm mm;
struct intel_gt *gt;
@ -260,6 +260,9 @@ struct i915_address_space {
u8 pd_shift;
u8 scratch_order;
/* Flags used when creating page-table objects for this vm */
unsigned long lmem_pt_obj_flags;
struct drm_i915_gem_object *
(*alloc_pt_dma)(struct i915_address_space *vm, int sz);
@ -519,7 +522,8 @@ i915_page_dir_dma_addr(const struct i915_ppgtt *ppgtt, const unsigned int n)
return __px_dma(pt ? px_base(pt) : ppgtt->vm.scratch[ppgtt->vm.top]);
}
void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt);
void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt,
unsigned long lmem_pt_obj_flags);
int i915_ggtt_probe_hw(struct drm_i915_private *i915);
int i915_ggtt_init_hw(struct drm_i915_private *i915);
@ -537,7 +541,8 @@ static inline bool i915_ggtt_has_aperture(const struct i915_ggtt *ggtt)
int i915_ppgtt_init_hw(struct intel_gt *gt);
struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt);
struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
unsigned long lmem_pt_obj_flags);
void i915_ggtt_suspend(struct i915_ggtt *gtt);
void i915_ggtt_resume(struct i915_ggtt *ggtt);

View File

@ -226,6 +226,40 @@ static const u8 gen12_xcs_offsets[] = {
END
};
static const u8 dg2_xcs_offsets[] = {
NOP(1),
LRI(15, POSTED),
REG16(0x244),
REG(0x034),
REG(0x030),
REG(0x038),
REG(0x03c),
REG(0x168),
REG(0x140),
REG(0x110),
REG(0x1c0),
REG(0x1c4),
REG(0x1c8),
REG(0x180),
REG16(0x2b4),
REG(0x120),
REG(0x124),
NOP(1),
LRI(9, POSTED),
REG16(0x3a8),
REG16(0x28c),
REG16(0x288),
REG16(0x284),
REG16(0x280),
REG16(0x27c),
REG16(0x278),
REG16(0x274),
REG16(0x270),
END
};
static const u8 gen8_rcs_offsets[] = {
NOP(1),
LRI(14, POSTED),
@ -525,6 +559,49 @@ static const u8 xehp_rcs_offsets[] = {
END
};
static const u8 dg2_rcs_offsets[] = {
NOP(1),
LRI(15, POSTED),
REG16(0x244),
REG(0x034),
REG(0x030),
REG(0x038),
REG(0x03c),
REG(0x168),
REG(0x140),
REG(0x110),
REG(0x1c0),
REG(0x1c4),
REG(0x1c8),
REG(0x180),
REG16(0x2b4),
REG(0x120),
REG(0x124),
NOP(1),
LRI(9, POSTED),
REG16(0x3a8),
REG16(0x28c),
REG16(0x288),
REG16(0x284),
REG16(0x280),
REG16(0x27c),
REG16(0x278),
REG16(0x274),
REG16(0x270),
LRI(3, POSTED),
REG(0x1b0),
REG16(0x5a8),
REG16(0x5ac),
NOP(6),
LRI(1, 0),
REG(0x0c8),
END
};
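
These tables are compact byte streams: NOP(x) pads, LRI(count, flags) opens a load-register-immediate group, and each REG/REG16 entry packs a register offset relative to the engine's mmio base. Assuming the encoding these macros use earlier in intel_lrc.c (REG stores offset >> 2 in 7 bits, so offsets of 0x200 and above need the two-byte REG16 form), a standalone check of why 0x244 appears as REG16 while 0x034 is a plain REG:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	uint32_t offsets[] = { 0x034, 0x244 };

	for (unsigned int i = 0; i < 2; i++) {
		uint32_t packed = offsets[i] >> 2; /* dword index */

		printf("0x%03x -> 0x%02x: %s\n",
		       (unsigned int)offsets[i], (unsigned int)packed,
		       packed < 0x80 ? "fits REG" : "needs REG16");
	}
	return 0;
}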
#undef END
#undef REG16
#undef REG
@ -543,7 +620,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
!intel_engine_has_relative_mmio(engine));
if (engine->class == RENDER_CLASS) {
if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
return dg2_rcs_offsets;
else if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 50))
return xehp_rcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_rcs_offsets;
@ -554,7 +633,9 @@ static const u8 *reg_offsets(const struct intel_engine_cs *engine)
else
return gen8_rcs_offsets;
} else {
if (GRAPHICS_VER(engine->i915) >= 12)
if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
return dg2_xcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 12)
return gen12_xcs_offsets;
else if (GRAPHICS_VER(engine->i915) >= 9)
return gen9_xcs_offsets;
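
GRAPHICS_VER_FULL() compares a combined version number, so DG2 (12.55) sorts above XeHP SDV (12.50) without separate major/minor checks, and the most specific table wins by ordering the branches from newest to oldest. A standalone sketch, assuming the kernel's IP_VER() packs the release number into the low byte:

#include <stdio.h>

#define IP_VER(ver, rel) (((ver) << 8) | (rel))

int main(void)
{
	unsigned int dg2 = IP_VER(12, 55), xehp = IP_VER(12, 50);

	/* 0xc37 >= 0xc32, so the dg2 tables are selected first */
	printf("dg2=%#x xehp=%#x dg2>=xehp:%d\n", dg2, xehp, dg2 >= xehp);
	return 0;
}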
@ -861,7 +942,8 @@ __lrc_alloc_state(struct intel_context *ce, struct intel_engine_cs *engine)
context_size += PAGE_SIZE;
}
obj = i915_gem_object_create_lmem(engine->i915, context_size, 0);
obj = i915_gem_object_create_lmem(engine->i915, context_size,
I915_BO_ALLOC_PM_VOLATILE);
if (IS_ERR(obj))
obj = i915_gem_object_create_shmem(engine->i915, context_size);
if (IS_ERR(obj))

View File

@ -78,7 +78,7 @@ static struct i915_address_space *migrate_vm(struct intel_gt *gt)
* TODO: Add support for huge LMEM PTEs
*/
vm = i915_ppgtt_create(gt);
vm = i915_ppgtt_create(gt, I915_BO_ALLOC_PM_EARLY);
if (IS_ERR(vm))
return ERR_CAST(vm);

View File

@ -22,6 +22,8 @@ struct drm_i915_mocs_table {
unsigned int size;
unsigned int n_entries;
const struct drm_i915_mocs_entry *table;
u8 uc_index;
u8 unused_entries_index;
};
/* Defines for the tables (XXX_MOCS_0 - XXX_MOCS_63) */
@ -40,6 +42,8 @@ struct drm_i915_mocs_table {
#define L3_ESC(value) ((value) << 0)
#define L3_SCC(value) ((value) << 1)
#define _L3_CACHEABILITY(value) ((value) << 4)
#define L3_GLBGO(value) ((value) << 6)
#define L3_LKUP(value) ((value) << 7)
/* Helper defines */
#define GEN9_NUM_MOCS_ENTRIES 64 /* 63-64 are reserved, but configured. */
@ -88,18 +92,25 @@ struct drm_i915_mocs_table {
*
* Entries not part of the following tables are undefined as far as
* userspace is concerned and shouldn't be relied upon. For Gen < 12
* they will be initialized to PTE. Gen >= 12 onwards don't have a setting for
* PTE and will be initialized to an invalid value.
* they will be initialized to PTE. Gen >= 12 has no PTE setting, and
* on those platforms (except TGL/RKL) unused entries are initialized
* to L3 WB to catch accidental use of reserved and unused MOCS indexes.
*
* The last few entries are reserved by the hardware. For ICL+ they
* should be initialized according to bspec and never used, for older
* platforms they should never be written to.
*
* NOTE: These tables are part of bspec and defined as part of hardware
* NOTE1: These tables are part of bspec and defined as part of hardware
* interface for ICL+. For older platforms, they are part of kernel
* ABI. It is expected that, for specific hardware platform, existing
* entries will remain constant and the table will only be updated by
* adding new entries, filling unused positions.
*
* NOTE2: For GEN >= 12 except TGL and RKL, reserved and unspecified MOCS
* indices have been set to L3 WB. These reserved entries should never
* be used; they may be changed to lower-performance variants with
* better coherency in the future if more entries are needed.
* For TGL/RKL, all the unspecified MOCS indexes are mapped to L3 UC.
*/
#define GEN9_MOCS_ENTRIES \
MOCS_ENTRY(I915_MOCS_UNCACHED, \
@ -282,17 +293,9 @@ static const struct drm_i915_mocs_entry icl_mocs_table[] = {
};
static const struct drm_i915_mocs_entry dg1_mocs_table[] = {
/* Error */
MOCS_ENTRY(0, 0, L3_0_DIRECT),
/* UC */
MOCS_ENTRY(1, 0, L3_1_UC),
/* Reserved */
MOCS_ENTRY(2, 0, L3_0_DIRECT),
MOCS_ENTRY(3, 0, L3_0_DIRECT),
MOCS_ENTRY(4, 0, L3_0_DIRECT),
/* WB - L3 */
MOCS_ENTRY(5, 0, L3_3_WB),
/* WB - L3 50% */
@ -314,6 +317,83 @@ static const struct drm_i915_mocs_entry dg1_mocs_table[] = {
MOCS_ENTRY(63, 0, L3_1_UC),
};
static const struct drm_i915_mocs_entry gen12_mocs_table[] = {
GEN11_MOCS_ENTRIES,
/* Implicitly enable L1 - HDC:L1 + L3 + LLC */
MOCS_ENTRY(48,
LE_3_WB | LE_TC_1_LLC | LE_LRUM(3),
L3_3_WB),
/* Implicitly enable L1 - HDC:L1 + L3 */
MOCS_ENTRY(49,
LE_1_UC | LE_TC_1_LLC,
L3_3_WB),
/* Implicitly enable L1 - HDC:L1 + LLC */
MOCS_ENTRY(50,
LE_3_WB | LE_TC_1_LLC | LE_LRUM(3),
L3_1_UC),
/* Implicitly enable L1 - HDC:L1 */
MOCS_ENTRY(51,
LE_1_UC | LE_TC_1_LLC,
L3_1_UC),
/* HW Special Case (CCS) */
MOCS_ENTRY(60,
LE_3_WB | LE_TC_1_LLC | LE_LRUM(3),
L3_1_UC),
/* HW Special Case (Displayable) */
MOCS_ENTRY(61,
LE_1_UC | LE_TC_1_LLC,
L3_3_WB),
};
static const struct drm_i915_mocs_entry xehpsdv_mocs_table[] = {
/* wa_1608975824 */
MOCS_ENTRY(0, 0, L3_3_WB | L3_LKUP(1)),
/* UC - Coherent; GO:L3 */
MOCS_ENTRY(1, 0, L3_1_UC | L3_LKUP(1)),
/* UC - Coherent; GO:Memory */
MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
/* UC - Non-Coherent; GO:Memory */
MOCS_ENTRY(3, 0, L3_1_UC | L3_GLBGO(1)),
/* UC - Non-Coherent; GO:L3 */
MOCS_ENTRY(4, 0, L3_1_UC),
/* WB */
MOCS_ENTRY(5, 0, L3_3_WB | L3_LKUP(1)),
/* HW Reserved - SW program but never use. */
MOCS_ENTRY(48, 0, L3_3_WB | L3_LKUP(1)),
MOCS_ENTRY(49, 0, L3_1_UC | L3_LKUP(1)),
MOCS_ENTRY(60, 0, L3_1_UC),
MOCS_ENTRY(61, 0, L3_1_UC),
MOCS_ENTRY(62, 0, L3_1_UC),
MOCS_ENTRY(63, 0, L3_1_UC),
};
static const struct drm_i915_mocs_entry dg2_mocs_table[] = {
/* UC - Coherent; GO:L3 */
MOCS_ENTRY(0, 0, L3_1_UC | L3_LKUP(1)),
/* UC - Coherent; GO:Memory */
MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
/* UC - Non-Coherent; GO:Memory */
MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)),
/* WB - LC */
MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
};
static const struct drm_i915_mocs_entry dg2_mocs_table_g10_ax[] = {
/* Wa_14011441408: Set Go to Memory for MOCS#0 */
MOCS_ENTRY(0, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
/* UC - Coherent; GO:Memory */
MOCS_ENTRY(1, 0, L3_1_UC | L3_GLBGO(1) | L3_LKUP(1)),
/* UC - Non-Coherent; GO:Memory */
MOCS_ENTRY(2, 0, L3_1_UC | L3_GLBGO(1)),
/* WB - LC */
MOCS_ENTRY(3, 0, L3_3_WB | L3_LKUP(1)),
};
enum {
HAS_GLOBAL_MOCS = BIT(0),
HAS_ENGINE_MOCS = BIT(1),
@ -340,14 +420,45 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
{
unsigned int flags;
if (IS_DG1(i915)) {
memset(table, 0, sizeof(struct drm_i915_mocs_table));
table->unused_entries_index = I915_MOCS_PTE;
if (IS_DG2(i915)) {
if (IS_DG2_GT_STEP(i915, G10, STEP_A0, STEP_B0)) {
table->size = ARRAY_SIZE(dg2_mocs_table_g10_ax);
table->table = dg2_mocs_table_g10_ax;
} else {
table->size = ARRAY_SIZE(dg2_mocs_table);
table->table = dg2_mocs_table;
}
table->uc_index = 1;
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
table->unused_entries_index = 3;
} else if (IS_XEHPSDV(i915)) {
table->size = ARRAY_SIZE(xehpsdv_mocs_table);
table->table = xehpsdv_mocs_table;
table->uc_index = 2;
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
table->unused_entries_index = 5;
} else if (IS_DG1(i915)) {
table->size = ARRAY_SIZE(dg1_mocs_table);
table->table = dg1_mocs_table;
table->uc_index = 1;
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
} else if (GRAPHICS_VER(i915) >= 12) {
table->uc_index = 1;
table->unused_entries_index = 5;
} else if (IS_TIGERLAKE(i915) || IS_ROCKETLAKE(i915)) {
/* For TGL/RKL, this can't be changed now for ABI reasons */
table->size = ARRAY_SIZE(tgl_mocs_table);
table->table = tgl_mocs_table;
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
table->uc_index = 3;
} else if (GRAPHICS_VER(i915) >= 12) {
table->size = ARRAY_SIZE(gen12_mocs_table);
table->table = gen12_mocs_table;
table->n_entries = GEN9_NUM_MOCS_ENTRIES;
table->uc_index = 3;
table->unused_entries_index = 2;
} else if (GRAPHICS_VER(i915) == 11) {
table->size = ARRAY_SIZE(icl_mocs_table);
table->table = icl_mocs_table;
@ -393,16 +504,16 @@ static unsigned int get_mocs_settings(const struct drm_i915_private *i915,
}
/*
* Get control_value from MOCS entry taking into account when it's not used:
* I915_MOCS_PTE's value is returned in this case.
* Get control_value from the MOCS entry. If the entry is not used,
* the value at unused_entries_index is returned instead (which the
* pre-Gen12 paths point at I915_MOCS_PTE).
*/
static u32 get_entry_control(const struct drm_i915_mocs_table *table,
unsigned int index)
{
if (index < table->size && table->table[index].used)
return table->table[index].control_value;
return table->table[I915_MOCS_PTE].control_value;
return table->table[table->unused_entries_index].control_value;
}
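
The fallback is a two-step check: any out-of-range index, or an in-range entry not marked used, resolves to the table's designated unused entry (I915_MOCS_PTE on the legacy paths, a platform-specific index on Gen12+). The same logic in isolation, with simplified types:

#include <stdbool.h>
#include <stdint.h>

struct mocs_entry {
	uint32_t control_value;
	bool used;
};

struct mocs_table {
	unsigned int size;
	uint8_t unused_entries_index;
	const struct mocs_entry *table;
};

static uint32_t entry_control(const struct mocs_table *t, unsigned int index)
{
	if (index < t->size && t->table[index].used)
		return t->table[index].control_value;

	/* unused and out-of-range indexes all land on one safe entry */
	return t->table[t->unused_entries_index].control_value;
}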
#define for_each_mocs(mocs, t, i) \
@ -417,6 +528,8 @@ static void __init_mocs_table(struct intel_uncore *uncore,
unsigned int i;
u32 mocs;
drm_WARN_ONCE(&uncore->i915->drm, !table->unused_entries_index,
"Unused entries index should have been defined\n");
for_each_mocs(mocs, table, i)
intel_uncore_write_fw(uncore, _MMIO(addr + i * 4), mocs);
}
@ -443,16 +556,16 @@ static void init_mocs_table(struct intel_engine_cs *engine,
}
/*
* Get l3cc_value from MOCS entry taking into account when it's not used:
* I915_MOCS_PTE's value is returned in this case.
* Get l3cc_value from the MOCS entry. If the entry is not used,
* the value at unused_entries_index is returned instead (which the
* pre-Gen12 paths point at I915_MOCS_PTE).
*/
static u16 get_entry_l3cc(const struct drm_i915_mocs_table *table,
unsigned int index)
{
if (index < table->size && table->table[index].used)
return table->table[index].l3cc_value;
return table->table[I915_MOCS_PTE].l3cc_value;
return table->table[table->unused_entries_index].l3cc_value;
}
static u32 l3cc_combine(u16 low, u16 high)
@ -468,10 +581,9 @@ static u32 l3cc_combine(u16 low, u16 high)
0; \
i++)
static void init_l3cc_table(struct intel_engine_cs *engine,
static void init_l3cc_table(struct intel_uncore *uncore,
const struct drm_i915_mocs_table *table)
{
struct intel_uncore *uncore = engine->uncore;
unsigned int i;
u32 l3cc;
@ -496,7 +608,7 @@ void intel_mocs_init_engine(struct intel_engine_cs *engine)
init_mocs_table(engine, &table);
if (flags & HAS_RENDER_L3CC && engine->class == RENDER_CLASS)
init_l3cc_table(engine, &table);
init_l3cc_table(engine->uncore, &table);
}
static u32 global_mocs_offset(void)
@ -504,6 +616,14 @@ static u32 global_mocs_offset(void)
return i915_mmio_reg_offset(GEN12_GLOBAL_MOCS(0));
}
void intel_set_mocs_index(struct intel_gt *gt)
{
struct drm_i915_mocs_table table;
get_mocs_settings(gt->i915, &table);
gt->mocs.uc_index = table.uc_index;
}
void intel_mocs_init(struct intel_gt *gt)
{
struct drm_i915_mocs_table table;
@ -515,6 +635,14 @@ void intel_mocs_init(struct intel_gt *gt)
flags = get_mocs_settings(gt->i915, &table);
if (flags & HAS_GLOBAL_MOCS)
__init_mocs_table(gt->uncore, &table, global_mocs_offset());
/*
* Initialize the L3CC table as part of MOCS initialization to make
* sure the LNCFCMOCSx registers are programmed before any subsequent
* memory transactions, including GuC transactions.
*/
if (flags & HAS_RENDER_L3CC)
init_l3cc_table(gt->uncore, &table);
}
#if IS_ENABLED(CONFIG_DRM_I915_SELFTEST)

View File

@ -36,5 +36,6 @@ struct intel_gt;
void intel_mocs_init(struct intel_gt *gt);
void intel_mocs_init_engine(struct intel_engine_cs *engine);
void intel_set_mocs_index(struct intel_gt *gt);
#endif

View File

@ -155,19 +155,20 @@ int i915_ppgtt_init_hw(struct intel_gt *gt)
}
static struct i915_ppgtt *
__ppgtt_create(struct intel_gt *gt)
__ppgtt_create(struct intel_gt *gt, unsigned long lmem_pt_obj_flags)
{
if (GRAPHICS_VER(gt->i915) < 8)
return gen6_ppgtt_create(gt);
else
return gen8_ppgtt_create(gt);
return gen8_ppgtt_create(gt, lmem_pt_obj_flags);
}
struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt)
struct i915_ppgtt *i915_ppgtt_create(struct intel_gt *gt,
unsigned long lmem_pt_obj_flags)
{
struct i915_ppgtt *ppgtt;
ppgtt = __ppgtt_create(gt);
ppgtt = __ppgtt_create(gt, lmem_pt_obj_flags);
if (IS_ERR(ppgtt))
return ppgtt;
@ -298,7 +299,8 @@ int ppgtt_set_pages(struct i915_vma *vma)
return 0;
}
void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt)
void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt,
unsigned long lmem_pt_obj_flags)
{
struct drm_i915_private *i915 = gt->i915;
@ -306,6 +308,7 @@ void ppgtt_init(struct i915_ppgtt *ppgtt, struct intel_gt *gt)
ppgtt->vm.i915 = i915;
ppgtt->vm.dma = i915->drm.dev;
ppgtt->vm.total = BIT_ULL(INTEL_INFO(i915)->ppgtt_size);
ppgtt->vm.lmem_pt_obj_flags = lmem_pt_obj_flags;
dma_resv_init(&ppgtt->vm._resv);
i915_address_space_init(&ppgtt->vm, VM_CLASS_PPGTT);
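
With the flag plumbed through, each caller decides how the page-table objects for its vm are allocated. The two call styles this series implies, sketched below (the zero-flag default for ordinary contexts is an assumption; I915_BO_ALLOC_PM_EARLY is read as "page tables must be usable early in resume", per the LMEM suspend/resume work in this series):

/* ordinary ppgtt: no special PM constraints on the page tables */
ppgtt = i915_ppgtt_create(gt, 0);

/* migrate vm (see the earlier hunk): PTs needed early on resume */
vm = i915_ppgtt_create(gt, I915_BO_ALLOC_PM_EARLY);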

View File

@ -32,7 +32,7 @@ static int init_fake_lmem_bar(struct intel_memory_region *mem)
mem->remap_addr = dma_map_resource(i915->drm.dev,
mem->region.start,
mem->fake_mappable.size,
PCI_DMA_BIDIRECTIONAL,
DMA_BIDIRECTIONAL,
DMA_ATTR_FORCE_CONTIGUOUS);
if (dma_mapping_error(i915->drm.dev, mem->remap_addr)) {
drm_mm_remove_node(&mem->fake_mappable);
@ -62,7 +62,7 @@ static void release_fake_lmem_bar(struct intel_memory_region *mem)
dma_unmap_resource(mem->i915->drm.dev,
mem->remap_addr,
mem->fake_mappable.size,
PCI_DMA_BIDIRECTIONAL,
DMA_BIDIRECTIONAL,
DMA_ATTR_FORCE_CONTIGUOUS);
}

View File

@ -112,7 +112,8 @@ static struct i915_vma *create_ring_vma(struct i915_ggtt *ggtt, int size)
struct drm_i915_gem_object *obj;
struct i915_vma *vma;
obj = i915_gem_object_create_lmem(i915, size, I915_BO_ALLOC_VOLATILE);
obj = i915_gem_object_create_lmem(i915, size, I915_BO_ALLOC_VOLATILE |
I915_BO_ALLOC_PM_VOLATILE);
if (IS_ERR(obj) && i915_ggtt_has_aperture(ggtt))
obj = i915_gem_object_create_stolen(i915, size);
if (IS_ERR(obj))

View File

@ -17,6 +17,7 @@
#include "intel_ring.h"
#include "shmem_utils.h"
#include "intel_engine_heartbeat.h"
#include "intel_engine_pm.h"
/* Rough estimate of the typical request size, performing a flush,
* set-context and then emitting the batch.
@ -292,6 +293,8 @@ static void xcs_sanitize(struct intel_engine_cs *engine)
/* And scrub the dirty cachelines for the HWSP */
clflush_cache_range(engine->status_page.addr, PAGE_SIZE);
intel_engine_reset_pinned_contexts(engine);
}
static void reset_prepare(struct intel_engine_cs *engine)
@ -1265,7 +1268,7 @@ static struct i915_vma *gen7_ctx_vma(struct intel_engine_cs *engine)
int size, err;
if (GRAPHICS_VER(engine->i915) != 7 || engine->class != RENDER_CLASS)
return 0;
return NULL;
err = gen7_ctx_switch_bb_setup(engine, NULL /* probe size */);
if (err < 0)

View File

@ -882,8 +882,6 @@ void intel_rps_park(struct intel_rps *rps)
if (!intel_rps_is_enabled(rps))
return;
GEM_BUG_ON(atomic_read(&rps->num_waiters));
if (!intel_rps_clear_active(rps))
return;
@ -996,20 +994,16 @@ int intel_rps_set(struct intel_rps *rps, u8 val)
static void gen6_rps_init(struct intel_rps *rps)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
u32 rp_state_cap = intel_rps_read_state_cap(rps);
/* All of these values are in units of 50MHz */
/* static values from HW: RP0 > RP1 > RPn (min_freq) */
if (IS_GEN9_LP(i915)) {
u32 rp_state_cap = intel_uncore_read(uncore, BXT_RP_STATE_CAP);
rps->rp0_freq = (rp_state_cap >> 16) & 0xff;
rps->rp1_freq = (rp_state_cap >> 8) & 0xff;
rps->min_freq = (rp_state_cap >> 0) & 0xff;
} else {
u32 rp_state_cap = intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
rps->rp0_freq = (rp_state_cap >> 0) & 0xff;
rps->rp1_freq = (rp_state_cap >> 8) & 0xff;
rps->min_freq = (rp_state_cap >> 16) & 0xff;
@ -2146,6 +2140,19 @@ int intel_rps_set_min_frequency(struct intel_rps *rps, u32 val)
return set_min_freq(rps, val);
}
u32 intel_rps_read_state_cap(struct intel_rps *rps)
{
struct drm_i915_private *i915 = rps_to_i915(rps);
struct intel_uncore *uncore = rps_to_uncore(rps);
if (IS_XEHPSDV(i915))
return intel_uncore_read(uncore, XEHPSDV_RP_STATE_CAP);
else if (IS_GEN9_LP(i915))
return intel_uncore_read(uncore, BXT_RP_STATE_CAP);
else
return intel_uncore_read(uncore, GEN6_RP_STATE_CAP);
}
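
The caps register keeps the same three 8-bit fields everywhere; only its address differs per platform, which is why one accessor suffices. Decoding follows gen6_rps_init() above, where GEN9_LP parts swap the byte positions of RP0 and RPn. A standalone sketch:

#include <stdint.h>
#include <stdio.h>

static void decode_rp_state_cap(uint32_t cap, int is_gen9_lp)
{
	unsigned int rp0, rp1, rpn;

	if (is_gen9_lp) {
		rp0 = (cap >> 16) & 0xff;
		rp1 = (cap >> 8) & 0xff;
		rpn = cap & 0xff;
	} else {
		rp0 = cap & 0xff;
		rp1 = (cap >> 8) & 0xff;
		rpn = (cap >> 16) & 0xff;
	}

	/* values are in units of 50 MHz on these platforms */
	printf("RP0=%u RP1=%u RPn=%u\n", rp0, rp1, rpn);
}

int main(void)
{
	decode_rp_state_cap(0x060c16, 0); /* hypothetical fuse value */
	return 0;
}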
/* External interface for intel_ips.ko */
static struct drm_i915_private __rcu *ips_mchdev;

View File

@ -41,6 +41,7 @@ u32 intel_rps_get_rp1_frequency(struct intel_rps *rps);
u32 intel_rps_get_rpn_frequency(struct intel_rps *rps);
u32 intel_rps_read_punit_req(struct intel_rps *rps);
u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps);
u32 intel_rps_read_state_cap(struct intel_rps *rps);
void gen5_rps_irq_handler(struct intel_rps *rps);
void gen6_rps_irq_handler(struct intel_rps *rps, u32 pm_iir);

View File

@ -46,11 +46,11 @@ u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice)
}
void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
u32 ss_mask)
u8 *subslice_mask, u32 ss_mask)
{
int offset = slice * sseu->ss_stride;
memcpy(&sseu->subslice_mask[offset], &ss_mask, sseu->ss_stride);
memcpy(&subslice_mask[offset], &ss_mask, sseu->ss_stride);
}
unsigned int
@ -100,14 +100,24 @@ static u16 compute_eu_total(const struct sseu_dev_info *sseu)
return total;
}
static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
u8 s_en, u32 ss_en, u16 eu_en)
static u32 get_ss_stride_mask(struct sseu_dev_info *sseu, u8 s, u32 ss_en)
{
u32 ss_mask;
ss_mask = ss_en >> (s * sseu->max_subslices);
ss_mask &= GENMASK(sseu->max_subslices - 1, 0);
return ss_mask;
}
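
The helper just shifts the per-slice window out of the combined fuse word and masks it down to max_subslices bits. The same operation in isolation, with a worked example:

#include <stdint.h>
#include <stdio.h>

static uint32_t ss_stride_mask(unsigned int max_subslices, unsigned int s,
			       uint32_t ss_en)
{
	uint32_t mask = ss_en >> (s * max_subslices);

	return mask & ((1u << max_subslices) - 1);
}

int main(void)
{
	/* 4 DSS per slice, fuse word 0xa6: slice 0 -> 0x6, slice 1 -> 0xa */
	printf("%#x %#x\n",
	       (unsigned int)ss_stride_mask(4, 0, 0xa6),
	       (unsigned int)ss_stride_mask(4, 1, 0xa6));
	return 0;
}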
static void gen11_compute_sseu_info(struct sseu_dev_info *sseu, u8 s_en,
u32 g_ss_en, u32 c_ss_en, u16 eu_en)
{
int s, ss;
/* ss_en represents entire subslice mask across all slices */
/* g_ss_en/c_ss_en represent entire subslice mask across all slices */
GEM_BUG_ON(sseu->max_slices * sseu->max_subslices >
sizeof(ss_en) * BITS_PER_BYTE);
sizeof(g_ss_en) * BITS_PER_BYTE);
for (s = 0; s < sseu->max_slices; s++) {
if ((s_en & BIT(s)) == 0)
@ -115,7 +125,22 @@ static void gen11_compute_sseu_info(struct sseu_dev_info *sseu,
sseu->slice_mask |= BIT(s);
intel_sseu_set_subslices(sseu, s, ss_en);
/*
* XeHP introduces the concept of compute vs geometry DSS. To
* reduce variation between GENs around subslice usage, store
* separate masks for the geometry-enabled and compute-enabled DSS,
* since userspace will need to query these masks independently.
* Also compute a total enabled subslice count
* for the purposes of selecting subslices to use in a
* particular GEM context.
*/
intel_sseu_set_subslices(sseu, s, sseu->compute_subslice_mask,
get_ss_stride_mask(sseu, s, c_ss_en));
intel_sseu_set_subslices(sseu, s, sseu->geometry_subslice_mask,
get_ss_stride_mask(sseu, s, g_ss_en));
intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
get_ss_stride_mask(sseu, s,
g_ss_en | c_ss_en));
for (ss = 0; ss < sseu->max_subslices; ss++)
if (intel_sseu_has_subslice(sseu, s, ss))
@ -129,7 +154,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
{
struct sseu_dev_info *sseu = &gt->info.sseu;
struct intel_uncore *uncore = gt->uncore;
u32 dss_en;
u32 g_dss_en, c_dss_en = 0;
u16 eu_en = 0;
u8 eu_en_fuse;
u8 s_en;
@ -160,7 +185,9 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
s_en = intel_uncore_read(uncore, GEN11_GT_SLICE_ENABLE) &
GEN11_GT_S_ENA_MASK;
dss_en = intel_uncore_read(uncore, GEN12_GT_DSS_ENABLE);
g_dss_en = intel_uncore_read(uncore, GEN12_GT_GEOMETRY_DSS_ENABLE);
if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
c_dss_en = intel_uncore_read(uncore, GEN12_GT_COMPUTE_DSS_ENABLE);
/* one bit per pair of EUs */
if (GRAPHICS_VER_FULL(gt->i915) >= IP_VER(12, 50))
@ -173,7 +200,7 @@ static void gen12_sseu_info_init(struct intel_gt *gt)
if (eu_en_fuse & BIT(eu))
eu_en |= BIT(eu * 2) | BIT(eu * 2 + 1);
gen11_compute_sseu_info(sseu, s_en, dss_en, eu_en);
gen11_compute_sseu_info(sseu, s_en, g_dss_en, c_dss_en, eu_en);
/* TGL only supports slice-level power gating */
sseu->has_slice_pg = 1;
@ -199,7 +226,7 @@ static void gen11_sseu_info_init(struct intel_gt *gt)
eu_en = ~(intel_uncore_read(uncore, GEN11_EU_DISABLE) &
GEN11_EU_DIS_MASK);
gen11_compute_sseu_info(sseu, s_en, ss_en, eu_en);
gen11_compute_sseu_info(sseu, s_en, ss_en, 0, eu_en);
/* ICL has no power gating restrictions. */
sseu->has_slice_pg = 1;
@ -240,7 +267,7 @@ static void cherryview_sseu_info_init(struct intel_gt *gt)
sseu_set_eus(sseu, 0, 1, ~disabled_mask);
}
intel_sseu_set_subslices(sseu, 0, subslice_mask);
intel_sseu_set_subslices(sseu, 0, sseu->subslice_mask, subslice_mask);
sseu->eu_total = compute_eu_total(sseu);
@ -296,7 +323,8 @@ static void gen9_sseu_info_init(struct intel_gt *gt)
/* skip disabled slice */
continue;
intel_sseu_set_subslices(sseu, s, subslice_mask);
intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
subslice_mask);
eu_disable = intel_uncore_read(uncore, GEN9_EU_DISABLE(s));
for (ss = 0; ss < sseu->max_subslices; ss++) {
@ -408,7 +436,8 @@ static void bdw_sseu_info_init(struct intel_gt *gt)
/* skip disabled slice */
continue;
intel_sseu_set_subslices(sseu, s, subslice_mask);
intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
subslice_mask);
for (ss = 0; ss < sseu->max_subslices; ss++) {
u8 eu_disabled_mask;
@ -485,10 +514,9 @@ static void hsw_sseu_info_init(struct intel_gt *gt)
}
fuse1 = intel_uncore_read(gt->uncore, HSW_PAVP_FUSE1);
switch ((fuse1 & HSW_F1_EU_DIS_MASK) >> HSW_F1_EU_DIS_SHIFT) {
switch (REG_FIELD_GET(HSW_F1_EU_DIS_MASK, fuse1)) {
default:
MISSING_CASE((fuse1 & HSW_F1_EU_DIS_MASK) >>
HSW_F1_EU_DIS_SHIFT);
MISSING_CASE(REG_FIELD_GET(HSW_F1_EU_DIS_MASK, fuse1));
fallthrough;
case HSW_F1_EU_DIS_10EUS:
sseu->eu_per_subslice = 10;
@ -506,7 +534,8 @@ static void hsw_sseu_info_init(struct intel_gt *gt)
sseu->eu_per_subslice);
for (s = 0; s < sseu->max_slices; s++) {
intel_sseu_set_subslices(sseu, s, subslice_mask);
intel_sseu_set_subslices(sseu, s, sseu->subslice_mask,
subslice_mask);
for (ss = 0; ss < sseu->max_subslices; ss++) {
sseu_set_eus(sseu, s, ss,

View File

@ -26,9 +26,14 @@ struct drm_printer;
#define GEN_DSS_PER_CSLICE 8
#define GEN_DSS_PER_MSLICE 8
#define GEN_MAX_GSLICES (GEN_MAX_SUBSLICES / GEN_DSS_PER_GSLICE)
#define GEN_MAX_CSLICES (GEN_MAX_SUBSLICES / GEN_DSS_PER_CSLICE)
struct sseu_dev_info {
u8 slice_mask;
u8 subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
u8 geometry_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
u8 compute_subslice_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICE_STRIDE];
u8 eu_mask[GEN_MAX_SLICES * GEN_MAX_SUBSLICES * GEN_MAX_EU_STRIDE];
u16 eu_total;
u8 eu_per_subslice;
@ -78,6 +83,10 @@ intel_sseu_has_subslice(const struct sseu_dev_info *sseu, int slice,
u8 mask;
int ss_idx = subslice / BITS_PER_BYTE;
if (slice >= sseu->max_slices ||
subslice >= sseu->max_subslices)
return false;
GEM_BUG_ON(ss_idx >= sseu->ss_stride);
mask = sseu->subslice_mask[slice * sseu->ss_stride + ss_idx];
@ -97,7 +106,7 @@ intel_sseu_subslices_per_slice(const struct sseu_dev_info *sseu, u8 slice);
u32 intel_sseu_get_subslices(const struct sseu_dev_info *sseu, u8 slice);
void intel_sseu_set_subslices(struct sseu_dev_info *sseu, int slice,
u32 ss_mask);
u8 *subslice_mask, u32 ss_mask);
void intel_sseu_info_init(struct intel_gt *gt);

View File

@ -4,9 +4,9 @@
* Copyright © 2020 Intel Corporation
*/
#include "debugfs_gt.h"
#include "intel_sseu_debugfs.h"
#include "i915_drv.h"
#include "intel_gt_debugfs.h"
#include "intel_sseu_debugfs.h"
static void sseu_copy_subslices(const struct sseu_dev_info *sseu,
int slice, u8 *to_mask)
@ -282,7 +282,7 @@ static int sseu_status_show(struct seq_file *m, void *unused)
return intel_sseu_status(m, gt);
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(sseu_status);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(sseu_status);
static int rcs_topology_show(struct seq_file *m, void *unused)
{
@ -293,11 +293,11 @@ static int rcs_topology_show(struct seq_file *m, void *unused)
return 0;
}
DEFINE_GT_DEBUGFS_ATTRIBUTE(rcs_topology);
DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(rcs_topology);
void intel_sseu_debugfs_register(struct intel_gt *gt, struct dentry *root)
{
static const struct debugfs_gt_file files[] = {
static const struct intel_gt_debugfs_file files[] = {
{ "sseu_status", &sseu_status_fops, NULL },
{ "rcs_topology", &rcs_topology_fops, NULL },
};

View File

@ -644,6 +644,72 @@ static void dg1_ctx_workarounds_init(struct intel_engine_cs *engine,
DG1_HZ_READ_SUPPRESSION_OPTIMIZATION_DISABLE);
}
static void fakewa_disable_nestedbb_mode(struct intel_engine_cs *engine,
struct i915_wa_list *wal)
{
/*
* This is a "fake" workaround defined by software to ensure we
* maintain reliable, backward-compatible behavior for userspace with
* regards to how nested MI_BATCH_BUFFER_START commands are handled.
*
* The per-context setting of MI_MODE[12] determines whether the bits
* of a nested MI_BATCH_BUFFER_START instruction should be interpreted
* in the traditional manner or whether they should instead use a new
* tgl+ meaning that breaks backward compatibility, but allows nesting
* into 3rd-level batchbuffers. When this new capability was first
* added in TGL, it remained off by default unless a context
* intentionally opted in to the new behavior. However Xe_HPG now
* flips this on by default and requires that we explicitly opt out if
* we don't want the new behavior.
*
* From a SW perspective, we want to maintain the backward-compatible
* behavior for userspace, so we'll apply a fake workaround to set it
* back to the legacy behavior on platforms where the hardware default
* is to break compatibility. At the moment there is no Linux
* userspace that utilizes third-level batchbuffers, so this will avoid
* userspace from needing to make any changes. using the legacy
* meaning is the correct thing to do. If/when we have userspace
* consumers that want to utilize third-level batch nesting, we can
* provide a context parameter to allow them to opt-in.
*/
wa_masked_dis(wal, RING_MI_MODE(engine->mmio_base), TGL_NESTED_BB_EN);
}
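
RING_MI_MODE is a masked register: the top 16 bits of a write select which of the bottom 16 bits take effect, so wa_masked_dis() can clear TGL_NESTED_BB_EN without a read-modify-write. The encoding in isolation, mirroring i915's _MASKED_BIT_ENABLE/_MASKED_BIT_DISABLE helpers:

#include <stdint.h>

/* write this value to clear 'bit' in a masked register */
static inline uint32_t masked_bit_disable(uint32_t bit)
{
	return bit << 16;
}

/* write this value to set 'bit'; unselected bits are left untouched */
static inline uint32_t masked_bit_enable(uint32_t bit)
{
	return (bit << 16) | bit;
}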
static void gen12_ctx_gt_mocs_init(struct intel_engine_cs *engine,
struct i915_wa_list *wal)
{
u8 mocs;
/*
* Some blitter commands do not have a field for MOCS; those
* commands will use the MOCS index pointed to by BLIT_CCTL.
* The BLIT_CCTL registers need to be programmed to un-cached.
*/
if (engine->class == COPY_ENGINE_CLASS) {
mocs = engine->gt->mocs.uc_index;
wa_write_clr_set(wal,
BLIT_CCTL(engine->mmio_base),
BLIT_CCTL_MASK,
BLIT_CCTL_MOCS(mocs, mocs));
}
}
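
wa_write_clr_set() is the read-modify-write flavor of the wa helpers: when the list is applied, the bits covered by the mask are cleared and the new value is OR'd in, which here points both MOCS fields of BLIT_CCTL at the uncached index. The core operation as a standalone sketch:

#include <stdint.h>

/* what applying a clr/set workaround does to the register value */
static inline uint32_t wa_clr_set(uint32_t old, uint32_t clr, uint32_t set)
{
	return (old & ~clr) | set;
}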
/*
* gen12_ctx_gt_fake_wa_init() doesn't program an official workaround
* defined by the hardware team; it programs general context registers.
* Carrying that context register programming in the context workaround
* list lets us use the wa framework for proper application and validation.
*/
static void
gen12_ctx_gt_fake_wa_init(struct intel_engine_cs *engine,
struct i915_wa_list *wal)
{
if (GRAPHICS_VER_FULL(engine->i915) >= IP_VER(12, 55))
fakewa_disable_nestedbb_mode(engine, wal);
gen12_ctx_gt_mocs_init(engine, wal);
}
static void
__intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
struct i915_wa_list *wal,
@ -651,11 +717,19 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
{
struct drm_i915_private *i915 = engine->i915;
if (engine->class != RENDER_CLASS)
return;
wa_init_start(wal, name, engine->name);
/* Applies to all engines */
/*
* Fake workarounds are not actual workarounds, but rather
* programming of context registers using the workaround framework.
*/
if (GRAPHICS_VER(i915) >= 12)
gen12_ctx_gt_fake_wa_init(engine, wal);
if (engine->class != RENDER_CLASS)
goto done;
if (IS_DG1(i915))
dg1_ctx_workarounds_init(engine, wal);
else if (GRAPHICS_VER(i915) == 12)
@ -685,6 +759,7 @@ __intel_engine_init_ctx_wa(struct intel_engine_cs *engine,
else
MISSING_CASE(GRAPHICS_VER(i915));
done:
wa_init_finish(wal);
}
@ -729,7 +804,7 @@ int intel_engine_emit_ctx_wa(struct i915_request *rq)
}
static void
gen4_gt_workarounds_init(struct drm_i915_private *i915,
gen4_gt_workarounds_init(struct intel_gt *gt,
struct i915_wa_list *wal)
{
/* WaDisable_RenderCache_OperationalFlush:gen4,ilk */
@ -737,29 +812,29 @@ gen4_gt_workarounds_init(struct drm_i915_private *i915,
}
static void
g4x_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
g4x_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
gen4_gt_workarounds_init(i915, wal);
gen4_gt_workarounds_init(gt, wal);
/* WaDisableRenderCachePipelinedFlush:g4x,ilk */
wa_masked_en(wal, CACHE_MODE_0, CM0_PIPELINED_RENDER_FLUSH_DISABLE);
}
static void
ilk_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
ilk_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
g4x_gt_workarounds_init(i915, wal);
g4x_gt_workarounds_init(gt, wal);
wa_masked_en(wal, _3D_CHICKEN2, _3D_CHICKEN2_WM_READ_PIPELINED);
}
static void
snb_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
snb_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
}
static void
ivb_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
ivb_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
/* Apply the WaDisableRHWOOptimizationForRenderHang:ivb workaround. */
wa_masked_dis(wal,
@ -775,7 +850,7 @@ ivb_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
}
static void
vlv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
vlv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
/* WaForceL3Serialization:vlv */
wa_write_clr(wal, GEN7_L3SQCREG4, L3SQ_URB_READ_CAM_MATCH_DISABLE);
@ -788,7 +863,7 @@ vlv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
}
static void
hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
hsw_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
/* L3 caching of data atomics doesn't work -- disable it. */
wa_write(wal, HSW_SCRATCH1, HSW_SCRATCH1_L3_DATA_ATOMICS_DISABLE);
@ -803,8 +878,10 @@ hsw_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
}
static void
gen9_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
gen9_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
struct drm_i915_private *i915 = gt->i915;
/* WaDisableKillLogic:bxt,skl,kbl */
if (!IS_COFFEELAKE(i915) && !IS_COMETLAKE(i915))
wa_write_or(wal,
@ -829,9 +906,9 @@ gen9_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal
}
static void
skl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
skl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
gen9_gt_workarounds_init(i915, wal);
gen9_gt_workarounds_init(gt, wal);
/* WaDisableGafsUnitClkGating:skl */
wa_write_or(wal,
@ -839,19 +916,19 @@ skl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
GEN8_EU_GAUNIT_CLOCK_GATE_DISABLE);
/* WaInPlaceDecompressionHang:skl */
if (IS_SKL_GT_STEP(i915, STEP_A0, STEP_H0))
if (IS_SKL_GT_STEP(gt->i915, STEP_A0, STEP_H0))
wa_write_or(wal,
GEN9_GAMT_ECO_REG_RW_IA,
GAMT_ECO_ENABLE_IN_PLACE_DECOMPRESS);
}
static void
kbl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
kbl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
gen9_gt_workarounds_init(i915, wal);
gen9_gt_workarounds_init(gt, wal);
/* WaDisableDynamicCreditSharing:kbl */
if (IS_KBL_GT_STEP(i915, 0, STEP_C0))
if (IS_KBL_GT_STEP(gt->i915, 0, STEP_C0))
wa_write_or(wal,
GAMT_CHKN_BIT_REG,
GAMT_CHKN_DISABLE_DYNAMIC_CREDIT_SHARING);
@ -868,15 +945,15 @@ kbl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
}
static void
glk_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
glk_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
gen9_gt_workarounds_init(i915, wal);
gen9_gt_workarounds_init(gt, wal);
}
static void
cfl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
cfl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
gen9_gt_workarounds_init(i915, wal);
gen9_gt_workarounds_init(gt, wal);
/* WaDisableGafsUnitClkGating:cfl */
wa_write_or(wal,
@ -901,21 +978,21 @@ static void __set_mcr_steering(struct i915_wa_list *wal,
wa_write_clr_set(wal, steering_reg, mcr_mask, mcr);
}
static void __add_mcr_wa(struct drm_i915_private *i915, struct i915_wa_list *wal,
static void __add_mcr_wa(struct intel_gt *gt, struct i915_wa_list *wal,
unsigned int slice, unsigned int subslice)
{
drm_dbg(&i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
drm_dbg(&gt->i915->drm, "MCR slice=0x%x, subslice=0x%x\n", slice, subslice);
__set_mcr_steering(wal, GEN8_MCR_SELECTOR, slice, subslice);
}
static void
icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
icl_wa_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
{
const struct sseu_dev_info *sseu = &i915->gt.info.sseu;
const struct sseu_dev_info *sseu = &gt->info.sseu;
unsigned int slice, subslice;
GEM_BUG_ON(GRAPHICS_VER(i915) < 11);
GEM_BUG_ON(GRAPHICS_VER(gt->i915) < 11);
GEM_BUG_ON(hweight8(sseu->slice_mask) > 1);
slice = 0;
@ -935,16 +1012,15 @@ icl_wa_init_mcr(struct drm_i915_private *i915, struct i915_wa_list *wal)
* then we can just rely on the default steering and won't need to
* worry about explicitly re-steering L3BANK reads later.
*/
if (i915->gt.info.l3bank_mask & BIT(subslice))
i915->gt.steering_table[L3BANK] = NULL;
if (gt->info.l3bank_mask & BIT(subslice))
gt->steering_table[L3BANK] = NULL;
__add_mcr_wa(i915, wal, slice, subslice);
__add_mcr_wa(gt, wal, slice, subslice);
}
static void
xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
{
struct drm_i915_private *i915 = gt->i915;
const struct sseu_dev_info *sseu = &gt->info.sseu;
unsigned long slice, subslice = 0, slice_mask = 0;
u64 dss_mask = 0;
@ -1008,7 +1084,7 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
WARN_ON(subslice > GEN_DSS_PER_GSLICE);
WARN_ON(dss_mask >> (slice * GEN_DSS_PER_GSLICE) == 0);
__add_mcr_wa(i915, wal, slice, subslice);
__add_mcr_wa(gt, wal, slice, subslice);
/*
* SQIDI ranges are special because they use different steering
@ -1024,9 +1100,11 @@ xehp_init_mcr(struct intel_gt *gt, struct i915_wa_list *wal)
}
static void
icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
icl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
icl_wa_init_mcr(i915, wal);
struct drm_i915_private *i915 = gt->i915;
icl_wa_init_mcr(gt, wal);
/* WaModifyGamTlbPartitioning:icl */
wa_write_clr_set(wal,
@ -1077,10 +1155,9 @@ icl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
* the engine-specific workaround list.
*/
static void
wa_14011060649(struct drm_i915_private *i915, struct i915_wa_list *wal)
wa_14011060649(struct intel_gt *gt, struct i915_wa_list *wal)
{
struct intel_engine_cs *engine;
struct intel_gt *gt = &i915->gt;
int id;
for_each_engine(engine, gt, id) {
@ -1094,22 +1171,23 @@ wa_14011060649(struct drm_i915_private *i915, struct i915_wa_list *wal)
}
static void
gen12_gt_workarounds_init(struct drm_i915_private *i915,
struct i915_wa_list *wal)
gen12_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
icl_wa_init_mcr(i915, wal);
icl_wa_init_mcr(gt, wal);
/* Wa_14011060649:tgl,rkl,dg1,adl-s,adl-p */
wa_14011060649(i915, wal);
wa_14011060649(gt, wal);
/* Wa_14011059788:tgl,rkl,adl-s,dg1,adl-p */
wa_write_or(wal, GEN10_DFR_RATIO_EN_AND_CHICKEN, DFR_DISABLE);
}
static void
tgl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
tgl_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
gen12_gt_workarounds_init(i915, wal);
struct drm_i915_private *i915 = gt->i915;
gen12_gt_workarounds_init(gt, wal);
/* Wa_1409420604:tgl */
if (IS_TGL_UY_GT_STEP(i915, STEP_A0, STEP_B0))
@ -1130,9 +1208,11 @@ tgl_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
}
static void
dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
dg1_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
gen12_gt_workarounds_init(i915, wal);
struct drm_i915_private *i915 = gt->i915;
gen12_gt_workarounds_init(gt, wal);
/* Wa_1607087056:dg1 */
if (IS_DG1_GT_STEP(i915, STEP_A0, STEP_B0))
@ -1154,60 +1234,62 @@ dg1_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
}
static void
xehpsdv_gt_workarounds_init(struct drm_i915_private *i915, struct i915_wa_list *wal)
xehpsdv_gt_workarounds_init(struct intel_gt *gt, struct i915_wa_list *wal)
{
xehp_init_mcr(&i915->gt, wal);
xehp_init_mcr(gt, wal);
}
static void
gt_init_workarounds(struct drm_i915_private *i915, struct i915_wa_list *wal)
gt_init_workarounds(struct intel_gt *gt, struct i915_wa_list *wal)
{
struct drm_i915_private *i915 = gt->i915;
if (IS_XEHPSDV(i915))
xehpsdv_gt_workarounds_init(i915, wal);
xehpsdv_gt_workarounds_init(gt, wal);
else if (IS_DG1(i915))
dg1_gt_workarounds_init(i915, wal);
dg1_gt_workarounds_init(gt, wal);
else if (IS_TIGERLAKE(i915))
tgl_gt_workarounds_init(i915, wal);
tgl_gt_workarounds_init(gt, wal);
else if (GRAPHICS_VER(i915) == 12)
gen12_gt_workarounds_init(i915, wal);
gen12_gt_workarounds_init(gt, wal);
else if (GRAPHICS_VER(i915) == 11)
icl_gt_workarounds_init(i915, wal);
icl_gt_workarounds_init(gt, wal);
else if (IS_COFFEELAKE(i915) || IS_COMETLAKE(i915))
cfl_gt_workarounds_init(i915, wal);
cfl_gt_workarounds_init(gt, wal);
else if (IS_GEMINILAKE(i915))
glk_gt_workarounds_init(i915, wal);
glk_gt_workarounds_init(gt, wal);
else if (IS_KABYLAKE(i915))
kbl_gt_workarounds_init(i915, wal);
kbl_gt_workarounds_init(gt, wal);
else if (IS_BROXTON(i915))
gen9_gt_workarounds_init(i915, wal);
gen9_gt_workarounds_init(gt, wal);
else if (IS_SKYLAKE(i915))
skl_gt_workarounds_init(i915, wal);
skl_gt_workarounds_init(gt, wal);
else if (IS_HASWELL(i915))
hsw_gt_workarounds_init(i915, wal);
hsw_gt_workarounds_init(gt, wal);
else if (IS_VALLEYVIEW(i915))
vlv_gt_workarounds_init(i915, wal);
vlv_gt_workarounds_init(gt, wal);
else if (IS_IVYBRIDGE(i915))
ivb_gt_workarounds_init(i915, wal);
ivb_gt_workarounds_init(gt, wal);
else if (GRAPHICS_VER(i915) == 6)
snb_gt_workarounds_init(i915, wal);
snb_gt_workarounds_init(gt, wal);
else if (GRAPHICS_VER(i915) == 5)
ilk_gt_workarounds_init(i915, wal);
ilk_gt_workarounds_init(gt, wal);
else if (IS_G4X(i915))
g4x_gt_workarounds_init(i915, wal);
g4x_gt_workarounds_init(gt, wal);
else if (GRAPHICS_VER(i915) == 4)
gen4_gt_workarounds_init(i915, wal);
gen4_gt_workarounds_init(gt, wal);
else if (GRAPHICS_VER(i915) <= 8)
;
else
MISSING_CASE(GRAPHICS_VER(i915));
}
void intel_gt_init_workarounds(struct drm_i915_private *i915)
void intel_gt_init_workarounds(struct intel_gt *gt)
{
struct i915_wa_list *wal = &i915->gt_wa_list;
struct i915_wa_list *wal = &gt->wa_list;
wa_init_start(wal, "GT", "global");
gt_init_workarounds(i915, wal);
gt_init_workarounds(gt, wal);
wa_init_finish(wal);
}
@ -1278,7 +1360,7 @@ wa_list_apply(struct intel_gt *gt, const struct i915_wa_list *wal)
void intel_gt_apply_workarounds(struct intel_gt *gt)
{
wa_list_apply(gt, &gt->i915->gt_wa_list);
wa_list_apply(gt, &gt->wa_list);
}
static bool wa_list_verify(struct intel_gt *gt,
@ -1310,7 +1392,7 @@ static bool wa_list_verify(struct intel_gt *gt,
bool intel_gt_verify_workarounds(struct intel_gt *gt, const char *from)
{
return wa_list_verify(gt, &gt->i915->gt_wa_list, from);
return wa_list_verify(gt, &gt->wa_list, from);
}
__maybe_unused
@ -1604,6 +1686,31 @@ void intel_engine_apply_whitelist(struct intel_engine_cs *engine)
i915_mmio_reg_offset(RING_NOPID(base)));
}
/*
* engine_fake_wa_init(), a placeholder to program registers that are
* not part of an official workaround defined by the hardware team.
* Programming those registers through the workaround framework lets
* us reuse it for proper application and verification.
*/
static void
engine_fake_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
{
u8 mocs;
/*
* RING_CMD_CCTL needs to be programmed to un-cached for memory
* writes and reads output by the Command Streamers on Gen12
* onward platforms.
*/
if (GRAPHICS_VER(engine->i915) >= 12) {
mocs = engine->gt->mocs.uc_index;
wa_masked_field_set(wal,
RING_CMD_CCTL(engine->mmio_base),
CMD_CCTL_MOCS_MASK,
CMD_CCTL_MOCS_OVERRIDE(mocs, mocs));
}
}
static void
rcs_engine_wa_init(struct intel_engine_cs *engine, struct i915_wa_list *wal)
{
@ -2044,6 +2151,8 @@ engine_init_workarounds(struct intel_engine_cs *engine, struct i915_wa_list *wal
if (I915_SELFTEST_ONLY(GRAPHICS_VER(engine->i915) < 4))
return;
engine_fake_wa_init(engine, wal);
if (engine->class == RENDER_CLASS)
rcs_engine_wa_init(engine, wal);
else
@ -2067,12 +2176,7 @@ void intel_engine_apply_workarounds(struct intel_engine_cs *engine)
wa_list_apply(engine->gt, &engine->wa_list);
}
struct mcr_range {
u32 start;
u32 end;
};
static const struct mcr_range mcr_ranges_gen8[] = {
static const struct i915_range mcr_ranges_gen8[] = {
{ .start = 0x5500, .end = 0x55ff },
{ .start = 0x7000, .end = 0x7fff },
{ .start = 0x9400, .end = 0x97ff },
@ -2081,7 +2185,7 @@ static const struct mcr_range mcr_ranges_gen8[] = {
{},
};
static const struct mcr_range mcr_ranges_gen12[] = {
static const struct i915_range mcr_ranges_gen12[] = {
{ .start = 0x8150, .end = 0x815f },
{ .start = 0x9520, .end = 0x955f },
{ .start = 0xb100, .end = 0xb3ff },
@ -2090,7 +2194,7 @@ static const struct mcr_range mcr_ranges_gen12[] = {
{},
};
static const struct mcr_range mcr_ranges_xehp[] = {
static const struct i915_range mcr_ranges_xehp[] = {
{ .start = 0x4000, .end = 0x4aff },
{ .start = 0x5200, .end = 0x52ff },
{ .start = 0x5400, .end = 0x7fff },
@ -2109,7 +2213,7 @@ static const struct mcr_range mcr_ranges_xehp[] = {
static bool mcr_range(struct drm_i915_private *i915, u32 offset)
{
const struct mcr_range *mcr_ranges;
const struct i915_range *mcr_ranges;
int i;
if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 50))
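
The shared struct i915_range replaces the local mcr_range type, so the same start/end walk can be reused wherever a null-terminated offset-range table is needed. The lookup in isolation:

#include <stdbool.h>
#include <stdint.h>

struct i915_range {
	uint32_t start;
	uint32_t end;
};

/* tables end with an all-zero sentinel entry, like the ones above */
static bool offset_in_ranges(const struct i915_range *r, uint32_t offset)
{
	for (; r->start || r->end; r++)
		if (offset >= r->start && offset <= r->end)
			return true;
	return false;
}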

View File

@ -24,7 +24,7 @@ static inline void intel_wa_list_free(struct i915_wa_list *wal)
void intel_engine_init_ctx_wa(struct intel_engine_cs *engine);
int intel_engine_emit_ctx_wa(struct i915_request *rq);
void intel_gt_init_workarounds(struct drm_i915_private *i915);
void intel_gt_init_workarounds(struct intel_gt *gt);
void intel_gt_apply_workarounds(struct intel_gt *gt);
bool intel_gt_verify_workarounds(struct intel_gt *gt, const char *from);


@@ -376,6 +376,8 @@ int mock_engine_init(struct intel_engine_cs *engine)
{
struct intel_context *ce;
+INIT_LIST_HEAD(&engine->pinned_contexts_list);
engine->sched_engine = i915_sched_engine_create(ENGINE_MOCK);
if (!engine->sched_engine)
return -ENOMEM;


@@ -290,7 +290,7 @@ static int live_heartbeat_fast(void *arg)
int err = 0;
/* Check that the heartbeat ticks at the desired rate. */
-if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
+if (!CONFIG_DRM_I915_HEARTBEAT_INTERVAL)
return 0;
for_each_engine(engine, gt, id) {
@@ -352,7 +352,7 @@ static int live_heartbeat_off(void *arg)
int err = 0;
/* Check that we can turn off heartbeat and not interrupt VIP */
-if (!IS_ACTIVE(CONFIG_DRM_I915_HEARTBEAT_INTERVAL))
+if (!CONFIG_DRM_I915_HEARTBEAT_INTERVAL)
return 0;
for_each_engine(engine, gt, id) {
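Both checks above can drop the IS_ACTIVE() wrapper because CONFIG_DRM_I915_HEARTBEAT_INTERVAL is an integer Kconfig option that is always defined, with 0 meaning disabled. A minimal illustration (the value shown is an assumption, not taken from this diff):

/* Kconfig "int" options expand to a plain literal, e.g.: */
#define CONFIG_DRM_I915_HEARTBEAT_INTERVAL 2500 /* ms; 0 disables */

if (!CONFIG_DRM_I915_HEARTBEAT_INTERVAL)
	return 0; /* feature disabled, skip the subtest */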


@@ -992,7 +992,7 @@ static int live_timeslice_preempt(void *arg)
* need to preempt the current task and replace it with another
* ready task.
*/
-if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+if (!CONFIG_DRM_I915_TIMESLICE_DURATION)
return 0;
obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
@@ -1122,7 +1122,7 @@ static int live_timeslice_rewind(void *arg)
* but only a few of those requests, forcing us to rewind the
* RING_TAIL of the original request.
*/
-if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+if (!CONFIG_DRM_I915_TIMESLICE_DURATION)
return 0;
for_each_engine(engine, gt, id) {
@@ -1299,7 +1299,7 @@ static int live_timeslice_queue(void *arg)
* ELSP[1] is already occupied, so must rely on timeslicing to
* eject ELSP[0] in favour of the queue.)
*/
-if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+if (!CONFIG_DRM_I915_TIMESLICE_DURATION)
return 0;
obj = i915_gem_object_create_internal(gt->i915, PAGE_SIZE);
@@ -1420,7 +1420,7 @@ static int live_timeslice_nopreempt(void *arg)
* We should not timeslice into a request that is marked with
* I915_REQUEST_NOPREEMPT.
*/
-if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+if (!CONFIG_DRM_I915_TIMESLICE_DURATION)
return 0;
if (igt_spinner_init(&spin, gt))
@@ -2260,7 +2260,7 @@ static int __cancel_hostile(struct live_preempt_cancel *arg)
int err;
/* Preempt cancel non-preemptible spinner in ELSP0 */
-if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
+if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT)
return 0;
if (!intel_has_reset_engine(arg->engine->gt))
@@ -2316,7 +2316,7 @@ static int __cancel_fail(struct live_preempt_cancel *arg)
struct i915_request *rq;
int err;
-if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
+if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT)
return 0;
if (!intel_has_reset_engine(engine->gt))
@@ -3375,7 +3375,7 @@ static int live_preempt_timeout(void *arg)
* Check that we force preemption to occur by cancelling the previous
* context if it refuses to yield the GPU.
*/
-if (!IS_ACTIVE(CONFIG_DRM_I915_PREEMPT_TIMEOUT))
+if (!CONFIG_DRM_I915_PREEMPT_TIMEOUT)
return 0;
if (!intel_has_reset_engine(gt))
@@ -3493,7 +3493,7 @@ static int smoke_submit(struct preempt_smoke *smoke,
if (batch) {
struct i915_address_space *vm;
-vm = i915_gem_context_get_vm_rcu(ctx);
+vm = i915_gem_context_get_eb_vm(ctx);
vma = i915_vma_instance(batch, vm, NULL);
i915_vm_put(vm);
if (IS_ERR(vma))


@@ -117,7 +117,7 @@ static struct i915_request *
hang_create_request(struct hang *h, struct intel_engine_cs *engine)
{
struct intel_gt *gt = h->gt;
-struct i915_address_space *vm = i915_gem_context_get_vm_rcu(h->ctx);
+struct i915_address_space *vm = i915_gem_context_get_eb_vm(h->ctx);
struct drm_i915_gem_object *obj;
struct i915_request *rq = NULL;
struct i915_vma *hws, *vma;
@@ -789,7 +789,7 @@ static int __igt_reset_engine(struct intel_gt *gt, bool active)
if (err)
pr_err("[%s] Wait for request %lld:%lld [0x%04X] failed: %d!\n",
engine->name, rq->fence.context,
-rq->fence.seqno, rq->context->guc_id, err);
+rq->fence.seqno, rq->context->guc_id.id, err);
}
skip:
@@ -1098,7 +1098,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
if (err)
pr_err("[%s] Wait for request %lld:%lld [0x%04X] failed: %d!\n",
engine->name, rq->fence.context,
-rq->fence.seqno, rq->context->guc_id, err);
+rq->fence.seqno, rq->context->guc_id.id, err);
}
count++;
@@ -1108,7 +1108,7 @@ static int __igt_reset_engines(struct intel_gt *gt,
pr_err("i915_reset_engine(%s:%s): failed to reset request %lld:%lld [0x%04X]\n",
engine->name, test_name,
rq->fence.context,
-rq->fence.seqno, rq->context->guc_id);
+rq->fence.seqno, rq->context->guc_id.id);
i915_request_put(rq);
GEM_TRACE_DUMP();
@@ -1596,7 +1596,7 @@ static int igt_reset_evict_ppgtt(void *arg)
if (INTEL_PPGTT(gt->i915) < INTEL_PPGTT_FULL)
return 0;
-ppgtt = i915_ppgtt_create(gt);
+ppgtt = i915_ppgtt_create(gt, 0);
if (IS_ERR(ppgtt))
return PTR_ERR(ppgtt);


@@ -66,7 +66,7 @@ reference_lists_init(struct intel_gt *gt, struct wa_lists *lists)
memset(lists, 0, sizeof(*lists));
wa_init_start(&lists->gt_wa_list, "GT_REF", "global");
-gt_init_workarounds(gt->i915, &lists->gt_wa_list);
+gt_init_workarounds(gt, &lists->gt_wa_list);
wa_init_finish(&lists->gt_wa_list);
for_each_engine(engine, gt, id) {


@@ -102,11 +102,11 @@ static_assert(sizeof(struct guc_ct_buffer_desc) == 64);
* |   +-------+--------------------------------------------------------------+
* |   |   7:0 | NUM_DWORDS = length (in dwords) of the embedded HXG message  |
* +---+-------+--------------------------------------------------------------+
-* | 1 | 31:0 |  +--------------------------------------------------------+  |
-* +---+-------+  |                                                        |  |
-* |...|       |  |  Embedded `HXG Message`_                               |  |
-* +---+-------+  |                                                        |  |
-* | n | 31:0 |  +--------------------------------------------------------+  |
+* | 1 | 31:0 |                                                              |
+* +---+-------+                                                              |
+* |...|       |                  [Embedded `HXG Message`_]                   |
+* +---+-------+                                                              |
+* | n | 31:0 |                                                              |
* +---+-------+--------------------------------------------------------------+
*/


@@ -38,11 +38,11 @@
* +---+-------+--------------------------------------------------------------+
* |   | Bits  | Description                                                  |
* +===+=======+==============================================================+
-* | 0 | 31:0 |  +--------------------------------------------------------+  |
-* +---+-------+  |                                                        |  |
-* |...|       |  |  Embedded `HXG Message`_                               |  |
-* +---+-------+  |                                                        |  |
-* | n | 31:0 |  +--------------------------------------------------------+  |
+* | 0 | 31:0 |                                                              |
+* +---+-------+                                                              |
+* |...|       |                  [Embedded `HXG Message`_]                   |
+* +---+-------+                                                              |
+* | n | 31:0 |                                                              |
* +---+-------+--------------------------------------------------------------+
*/


@@ -3,6 +3,7 @@
* Copyright © 2014-2019 Intel Corporation
*/
+#include "gem/i915_gem_lmem.h"
#include "gt/intel_gt.h"
#include "gt/intel_gt_irq.h"
#include "gt/intel_gt_pm_irq.h"
@@ -647,7 +648,14 @@ struct i915_vma *intel_guc_allocate_vma(struct intel_guc *guc, u32 size)
u64 flags;
int ret;
-obj = i915_gem_object_create_shmem(gt->i915, size);
+if (HAS_LMEM(gt->i915))
+obj = i915_gem_object_create_lmem(gt->i915, size,
+I915_BO_ALLOC_CPU_CLEAR |
+I915_BO_ALLOC_CONTIGUOUS |
+I915_BO_ALLOC_PM_EARLY);
+else
+obj = i915_gem_object_create_shmem(gt->i915, size);
if (IS_ERR(obj))
return ERR_CAST(obj);
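A short gloss on the allocation flags used above; these summaries are assumptions based on the LMEM suspend/resume work elsewhere in this series, not authoritative definitions:

/*
 * Assumed semantics:
 *  I915_BO_ALLOC_CPU_CLEAR  - zero the backing store at allocation time
 *  I915_BO_ALLOC_CONTIGUOUS - require physically contiguous LMEM
 *  I915_BO_ALLOC_PM_EARLY   - back up / restore this object early across
 *                             suspend, before the GuC itself is reloaded
 */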


@@ -22,74 +22,121 @@
struct __guc_ads_blob;
-/*
-* Top level structure of GuC. It handles firmware loading and manages client
-* pool. intel_guc owns a intel_guc_client to replace the legacy ExecList
-* submission.
+/**
+* struct intel_guc - Top level structure of GuC.
+*
+* It handles firmware loading and manages client pool. intel_guc owns an
+* i915_sched_engine for submission.
*/
struct intel_guc {
/** @fw: the GuC firmware */
struct intel_uc_fw fw;
/** @log: sub-structure containing GuC log related data and objects */
struct intel_guc_log log;
/** @ct: the command transport communication channel */
struct intel_guc_ct ct;
/** @slpc: sub-structure containing SLPC related data and objects */
struct intel_guc_slpc slpc;
-/* Global engine used to submit requests to GuC */
+/** @sched_engine: Global engine used to submit requests to GuC */
struct i915_sched_engine *sched_engine;
/**
* @stalled_request: if GuC can't process a request for any reason, we
* save it until GuC restarts processing. No other request can be
* submitted until the stalled request is processed.
*/
struct i915_request *stalled_request;
-/* intel_guc_recv interrupt related state */
+/** @irq_lock: protects GuC irq state */
spinlock_t irq_lock;
/**
* @msg_enabled_mask: mask of events that are processed when receiving
* an INTEL_GUC_ACTION_DEFAULT G2H message.
*/
unsigned int msg_enabled_mask;
/**
* @outstanding_submission_g2h: number of outstanding GuC to Host
* responses related to GuC submission, used to determine if the GT is
* idle
*/
atomic_t outstanding_submission_g2h;
/** @interrupts: pointers to GuC interrupt-managing functions. */
struct {
void (*reset)(struct intel_guc *guc);
void (*enable)(struct intel_guc *guc);
void (*disable)(struct intel_guc *guc);
} interrupts;
-/*
-* contexts_lock protects the pool of free guc ids and a linked list of
-* guc ids available to be stolen
+/**
+* @contexts_lock: protects guc_ids, guc_id_list, ce->guc_id.id, and
+* ce->guc_id.ref when transitioning in and out of zero
*/
spinlock_t contexts_lock;
/** @guc_ids: used to allocate unique ce->guc_id.id values */
struct ida guc_ids;
/**
* @guc_id_list: list of intel_context with valid guc_ids but no refs
*/
struct list_head guc_id_list;
/**
* @submission_supported: tracks whether we support GuC submission on
* the current platform
*/
bool submission_supported;
/** @submission_selected: tracks whether the user enabled GuC submission */
bool submission_selected;
/**
* @rc_supported: tracks whether we support GuC rc on the current platform
*/
bool rc_supported;
/** @rc_selected: tracks whether the user enabled GuC rc */
bool rc_selected;
/** @ads_vma: object allocated to hold the GuC ADS */
struct i915_vma *ads_vma;
/** @ads_blob: contents of the GuC ADS */
struct __guc_ads_blob *ads_blob;
/** @ads_regset_size: size of the save/restore regsets in the ADS */
u32 ads_regset_size;
/** @ads_golden_ctxt_size: size of the golden contexts in the ADS */
u32 ads_golden_ctxt_size;
/** @lrc_desc_pool: object allocated to hold the GuC LRC descriptor pool */
struct i915_vma *lrc_desc_pool;
/** @lrc_desc_pool_vaddr: contents of the GuC LRC descriptor pool */
void *lrc_desc_pool_vaddr;
-/* guc_id to intel_context lookup */
+/**
+* @context_lookup: used to resolve intel_context from guc_id, if a
+* context is present in this structure it is registered with the GuC
+*/
struct xarray context_lookup;
-/* Control params for fw initialization */
+/** @params: Control params for fw initialization */
u32 params[GUC_CTL_MAX_DWORDS];
-/* GuC's FW specific registers used in MMIO send */
+/** @send_regs: GuC's FW specific registers used for sending MMIO H2G */
struct {
u32 base;
unsigned int count;
enum forcewake_domains fw_domains;
} send_regs;
-/* register used to send interrupts to the GuC FW */
+/** @notify_reg: register used to send interrupts to the GuC FW */
i915_reg_t notify_reg;
-/* Store msg (e.g. log flush) that we see while CTBs are disabled */
+/**
+* @mmio_msg: notification bitmask that the GuC writes in one of its
+* registers when the CT channel is disabled, to be processed when the
+* channel is back up.
+*/
u32 mmio_msg;
-/* To serialize the intel_guc_send actions */
+/** @send_mutex: used to serialize the intel_guc_send actions */
struct mutex send_mutex;
};
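To make the @context_lookup kerneldoc concrete, a minimal sketch of the lookup direction it documents; the helper name is hypothetical, only xa_load() is an existing API:

static struct intel_context *guc_id_to_context(struct intel_guc *guc, u32 id)
{
	return xa_load(&guc->context_lookup, id); /* guc_id -> intel_context */
}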


@@ -349,6 +349,8 @@ static void fill_engine_enable_masks(struct intel_gt *gt,
info->engine_enabled_masks[GUC_VIDEOENHANCE_CLASS] = VEBOX_MASK(gt);
}
+#define LR_HW_CONTEXT_SIZE (80 * sizeof(u32))
+#define LRC_SKIP_SIZE (LRC_PPHWSP_SZ * PAGE_SIZE + LR_HW_CONTEXT_SIZE)
static int guc_prep_golden_context(struct intel_guc *guc,
struct __guc_ads_blob *blob)
{
@@ -396,7 +398,18 @@ static int guc_prep_golden_context(struct intel_guc *guc,
if (!blob)
continue;
-blob->ads.eng_state_size[guc_class] = real_size;
+/*
+* This interface is slightly confusing. We need to pass the
+* base address of the full golden context and the size of just
+* the engine state, which is the section of the context image
+* that starts after the execlists context. This is required to
+* allow the GuC to restore just the engine state when a
+* watchdog reset occurs.
+* We calculate the engine state size by removing the size of
+* what comes before it in the context image (which is identical
+* on all engines).
+*/
+blob->ads.eng_state_size[guc_class] = real_size - LRC_SKIP_SIZE;
blob->ads.golden_context_lrca[guc_class] = addr_ggtt;
addr_ggtt += alloc_size;
}
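As a rough sanity check of the size calculation above, assuming LRC_PPHWSP_SZ is one page and 4 KiB pages (both assumptions, not stated in this diff):

/*
 * LR_HW_CONTEXT_SIZE = 80 * 4       =   320 bytes
 * LRC_SKIP_SIZE      = 4096 + 320   =  4416 bytes
 * a 16 KiB golden context image would then advertise
 * eng_state_size     = 16384 - 4416 = 11968 bytes
 */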
@@ -436,11 +449,6 @@ static void guc_init_golden_context(struct intel_guc *guc)
u8 engine_class, guc_class;
u8 *ptr;
-/* Skip execlist and PPGTT registers + HWSP */
-const u32 lr_hw_context_size = 80 * sizeof(u32);
-const u32 skip_size = LRC_PPHWSP_SZ * PAGE_SIZE +
-lr_hw_context_size;
if (!intel_uc_uses_guc_submission(&gt->uc))
return;
@@ -476,12 +484,12 @@ static void guc_init_golden_context(struct intel_guc *guc)
continue;
}
-GEM_BUG_ON(blob->ads.eng_state_size[guc_class] != real_size);
+GEM_BUG_ON(blob->ads.eng_state_size[guc_class] !=
+real_size - LRC_SKIP_SIZE);
GEM_BUG_ON(blob->ads.golden_context_lrca[guc_class] != addr_ggtt);
addr_ggtt += alloc_size;
-shmem_read(engine->default_state, skip_size, ptr + skip_size,
-real_size - skip_size);
+shmem_read(engine->default_state, 0, ptr, real_size);
ptr += alloc_size;
}


@@ -168,12 +168,15 @@ static int guc_action_register_ct_buffer(struct intel_guc *guc, u32 type,
FIELD_PREP(HOST2GUC_REGISTER_CTB_REQUEST_MSG_2_DESC_ADDR, desc_addr),
FIELD_PREP(HOST2GUC_REGISTER_CTB_REQUEST_MSG_3_BUFF_ADDR, buff_addr),
};
+int ret;
GEM_BUG_ON(type != GUC_CTB_TYPE_HOST2GUC && type != GUC_CTB_TYPE_GUC2HOST);
GEM_BUG_ON(size % SZ_4K);
/* CT registration must go over MMIO */
-return intel_guc_send_mmio(guc, request, ARRAY_SIZE(request), NULL, 0);
+ret = intel_guc_send_mmio(guc, request, ARRAY_SIZE(request), NULL, 0);
+return ret > 0 ? -EPROTO : ret;
}
static int ct_register_buffer(struct intel_guc_ct *ct, u32 type,
@@ -188,8 +191,8 @@ static int ct_register_buffer(struct intel_guc_ct *ct, u32 type,
err = guc_action_register_ct_buffer(ct_to_guc(ct), type,
desc_addr, buff_addr, size);
if (unlikely(err))
-CT_ERROR(ct, "Failed to register %s buffer (err=%d)\n",
-guc_ct_buffer_type_to_str(type), err);
+CT_ERROR(ct, "Failed to register %s buffer (%pe)\n",
+guc_ct_buffer_type_to_str(type), ERR_PTR(err));
return err;
}
@@ -201,11 +204,14 @@ static int guc_action_deregister_ct_buffer(struct intel_guc *guc, u32 type)
FIELD_PREP(GUC_HXG_REQUEST_MSG_0_ACTION, GUC_ACTION_HOST2GUC_DEREGISTER_CTB),
FIELD_PREP(HOST2GUC_DEREGISTER_CTB_REQUEST_MSG_1_TYPE, type),
};
+int ret;
GEM_BUG_ON(type != GUC_CTB_TYPE_HOST2GUC && type != GUC_CTB_TYPE_GUC2HOST);
/* CT deregistration must go over MMIO */
-return intel_guc_send_mmio(guc, request, ARRAY_SIZE(request), NULL, 0);
+ret = intel_guc_send_mmio(guc, request, ARRAY_SIZE(request), NULL, 0);
+return ret > 0 ? -EPROTO : ret;
}
static int ct_deregister_buffer(struct intel_guc_ct *ct, u32 type)
@@ -213,8 +219,8 @@ static int ct_deregister_buffer(struct intel_guc_ct *ct, u32 type)
int err = guc_action_deregister_ct_buffer(ct_to_guc(ct), type);
if (unlikely(err))
-CT_ERROR(ct, "Failed to deregister %s buffer (err=%d)\n",
-guc_ct_buffer_type_to_str(type), err);
+CT_ERROR(ct, "Failed to deregister %s buffer (%pe)\n",
+guc_ct_buffer_type_to_str(type), ERR_PTR(err));
return err;
}
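The err=%d to %pe conversions in these messages swap a raw integer for a symbolic errno name in the log; a minimal illustration of the printk convention (message text made up here):

pr_err("Failed to register buffer (%pe)\n", ERR_PTR(-ENXIO));
/* logs "Failed to register buffer (-ENXIO)" rather than "(err=-6)" */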
@@ -522,9 +528,6 @@ static int wait_for_ct_request_update(struct ct_request *req, u32 *status)
err = wait_for(done, GUC_CTB_RESPONSE_TIMEOUT_LONG_MS);
#undef done
-if (unlikely(err))
-DRM_ERROR("CT: fence %u err %d\n", req->fence, err);
*status = req->status;
return err;
}
@@ -722,8 +725,11 @@ static int ct_send(struct intel_guc_ct *ct,
err = wait_for_ct_request_update(&request, status);
g2h_release_space(ct, GUC_CTB_HXG_MSG_MAX_LEN);
-if (unlikely(err))
+if (unlikely(err)) {
+CT_ERROR(ct, "No response for request %#x (fence %u)\n",
+action[0], request.fence);
goto unlink;
+}
if (FIELD_GET(GUC_HXG_MSG_0_TYPE, *status) != GUC_HXG_TYPE_RESPONSE_SUCCESS) {
err = -EIO;
@@ -775,8 +781,8 @@ int intel_guc_ct_send(struct intel_guc_ct *ct, const u32 *action, u32 len,
ret = ct_send(ct, action, len, response_buf, response_buf_size, &status);
if (unlikely(ret < 0)) {
-CT_ERROR(ct, "Sending action %#x failed (err=%d status=%#X)\n",
-action[0], ret, status);
+CT_ERROR(ct, "Sending action %#x failed (%pe) status=%#X\n",
+action[0], ERR_PTR(ret), status);
} else if (unlikely(ret)) {
CT_DEBUG(ct, "send action %#x returned %d (%#x)\n",
action[0], ret, ret);
@@ -1042,9 +1048,9 @@ static void ct_incoming_request_worker_func(struct work_struct *w)
container_of(w, struct intel_guc_ct, requests.worker);
bool done;
-done = ct_process_incoming_requests(ct);
-if (!done)
-queue_work(system_unbound_wq, &ct->requests.worker);
+do {
+done = ct_process_incoming_requests(ct);
+} while (!done);
}
static int ct_handle_event(struct intel_guc_ct *ct, struct ct_incoming_msg *request)


@@ -5,14 +5,14 @@
#include <drm/drm_print.h>
-#include "gt/debugfs_gt.h"
+#include "gt/intel_gt_debugfs.h"
+#include "gt/uc/intel_guc_ads.h"
+#include "gt/uc/intel_guc_ct.h"
+#include "gt/uc/intel_guc_slpc.h"
+#include "gt/uc/intel_guc_submission.h"
#include "intel_guc.h"
#include "intel_guc_debugfs.h"
#include "intel_guc_log_debugfs.h"
-#include "gt/uc/intel_guc_ct.h"
-#include "gt/uc/intel_guc_ads.h"
-#include "gt/uc/intel_guc_submission.h"
-#include "gt/uc/intel_guc_slpc.h"
static int guc_info_show(struct seq_file *m, void *data)
{
@@ -35,7 +35,7 @@ static int guc_info_show(struct seq_file *m, void *data)
return 0;
}
-DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_info);
+DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(guc_info);
static int guc_registered_contexts_show(struct seq_file *m, void *data)
{
@@ -49,7 +49,7 @@ static int guc_registered_contexts_show(struct seq_file *m, void *data)
return 0;
}
-DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_registered_contexts);
+DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(guc_registered_contexts);
static int guc_slpc_info_show(struct seq_file *m, void *unused)
{
@@ -62,7 +62,7 @@ static int guc_slpc_info_show(struct seq_file *m, void *unused)
return intel_guc_slpc_print_info(slpc, &p);
}
-DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_slpc_info);
+DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(guc_slpc_info);
static bool intel_eval_slpc_support(void *data)
{
@@ -73,7 +73,7 @@ static bool intel_eval_slpc_support(void *data)
void intel_guc_debugfs_register(struct intel_guc *guc, struct dentry *root)
{
-static const struct debugfs_gt_file files[] = {
+static const struct intel_gt_debugfs_file files[] = {
{ "guc_info", &guc_info_fops, NULL },
{ "guc_registered_contexts", &guc_registered_contexts_fops, NULL },
{ "guc_slpc_info", &guc_slpc_info_fops, &intel_eval_slpc_support},


@@ -41,18 +41,21 @@ static void guc_prepare_xfer(struct intel_uncore *uncore)
}
/* Copy RSA signature from the fw image to HW for verification */
-static void guc_xfer_rsa(struct intel_uc_fw *guc_fw,
-struct intel_uncore *uncore)
+static int guc_xfer_rsa(struct intel_uc_fw *guc_fw,
+struct intel_uncore *uncore)
{
u32 rsa[UOS_RSA_SCRATCH_COUNT];
size_t copied;
int i;
copied = intel_uc_fw_copy_rsa(guc_fw, rsa, sizeof(rsa));
-GEM_BUG_ON(copied < sizeof(rsa));
+if (copied < sizeof(rsa))
+return -ENOMEM;
for (i = 0; i < UOS_RSA_SCRATCH_COUNT; i++)
intel_uncore_write(uncore, UOS_RSA_SCRATCH(i), rsa[i]);
+return 0;
}
/*
@@ -141,7 +144,9 @@ int intel_guc_fw_upload(struct intel_guc *guc)
* by the DMA engine in one operation, whereas the RSA signature is
* loaded via MMIO.
*/
-guc_xfer_rsa(&guc->fw, uncore);
+ret = guc_xfer_rsa(&guc->fw, uncore);
+if (ret)
+goto out;
/*
* Current uCode expects the code to be loaded at 8k; locations below


@@ -6,7 +6,7 @@
#include <linux/fs.h>
#include <drm/drm_print.h>
-#include "gt/debugfs_gt.h"
+#include "gt/intel_gt_debugfs.h"
#include "intel_guc.h"
#include "intel_guc_log.h"
#include "intel_guc_log_debugfs.h"
@@ -17,7 +17,7 @@ static int guc_log_dump_show(struct seq_file *m, void *data)
return intel_guc_log_dump(m->private, &p, false);
}
-DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_log_dump);
+DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(guc_log_dump);
static int guc_load_err_log_dump_show(struct seq_file *m, void *data)
{
@@ -25,7 +25,7 @@ static int guc_load_err_log_dump_show(struct seq_file *m, void *data)
return intel_guc_log_dump(m->private, &p, true);
}
-DEFINE_GT_DEBUGFS_ATTRIBUTE(guc_load_err_log_dump);
+DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(guc_load_err_log_dump);
static int guc_log_level_get(void *data, u64 *val)
{
@@ -109,7 +109,7 @@ static const struct file_operations guc_log_relay_fops = {
void intel_guc_log_debugfs_register(struct intel_guc_log *log,
struct dentry *root)
{
-static const struct debugfs_gt_file files[] = {
+static const struct intel_gt_debugfs_file files[] = {
{ "guc_log_dump", &guc_log_dump_fops, NULL },
{ "guc_load_err_log_dump", &guc_load_err_log_dump_fops, NULL },
{ "guc_log_level", &guc_log_level_fops, NULL },

[File diff suppressed because it is too large]

@@ -87,17 +87,25 @@ static int intel_huc_rsa_data_create(struct intel_huc *huc)
vma->obj, true));
if (IS_ERR(vaddr)) {
i915_vma_unpin_and_release(&vma, 0);
-return PTR_ERR(vaddr);
+err = PTR_ERR(vaddr);
+goto unpin_out;
}
copied = intel_uc_fw_copy_rsa(&huc->fw, vaddr, vma->size);
-GEM_BUG_ON(copied < huc->fw.rsa_size);
i915_gem_object_unpin_map(vma->obj);
+if (copied < huc->fw.rsa_size) {
+err = -ENOMEM;
+goto unpin_out;
+}
huc->rsa_data = vma;
return 0;
+unpin_out:
+i915_vma_unpin_and_release(&vma, 0);
+return err;
}
static void intel_huc_rsa_data_destroy(struct intel_huc *huc)


@@ -5,7 +5,7 @@
#include <drm/drm_print.h>
-#include "gt/debugfs_gt.h"
+#include "gt/intel_gt_debugfs.h"
#include "intel_huc.h"
#include "intel_huc_debugfs.h"
@@ -21,11 +21,11 @@ static int huc_info_show(struct seq_file *m, void *data)
return 0;
}
-DEFINE_GT_DEBUGFS_ATTRIBUTE(huc_info);
+DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(huc_info);
void intel_huc_debugfs_register(struct intel_huc *huc, struct dentry *root)
{
-static const struct debugfs_gt_file files[] = {
+static const struct intel_gt_debugfs_file files[] = {
{ "huc_info", &huc_info_fops, NULL },
};


@@ -35,7 +35,7 @@ static void uc_expand_default_options(struct intel_uc *uc)
}
/* Intermediate platforms are HuC authentication only */
-if (IS_DG1(i915) || IS_ALDERLAKE_S(i915)) {
+if (IS_ALDERLAKE_S(i915)) {
i915->params.enable_guc = ENABLE_GUC_LOAD_HUC;
return;
}


@@ -6,7 +6,7 @@
#include <linux/debugfs.h>
#include <drm/drm_print.h>
-#include "gt/debugfs_gt.h"
+#include "gt/intel_gt_debugfs.h"
#include "intel_guc_debugfs.h"
#include "intel_huc_debugfs.h"
#include "intel_uc.h"
@@ -32,11 +32,11 @@ static int uc_usage_show(struct seq_file *m, void *data)
return 0;
}
-DEFINE_GT_DEBUGFS_ATTRIBUTE(uc_usage);
+DEFINE_INTEL_GT_DEBUGFS_ATTRIBUTE(uc_usage);
void intel_uc_debugfs_register(struct intel_uc *uc, struct dentry *gt_root)
{
-static const struct debugfs_gt_file files[] = {
+static const struct intel_gt_debugfs_file files[] = {
{ "usage", &uc_usage_fops, NULL },
};
struct dentry *root;


@@ -7,6 +7,7 @@
#include <linux/firmware.h>
#include <drm/drm_print.h>
+#include "gem/i915_gem_lmem.h"
#include "intel_uc_fw.h"
#include "intel_uc_fw_abi.h"
#include "i915_drv.h"
@@ -50,6 +51,7 @@ void intel_uc_fw_change_status(struct intel_uc_fw *uc_fw,
#define INTEL_UC_FIRMWARE_DEFS(fw_def, guc_def, huc_def) \
fw_def(ALDERLAKE_P, 0, guc_def(adlp, 62, 0, 3), huc_def(tgl, 7, 9, 3)) \
fw_def(ALDERLAKE_S, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl, 7, 9, 3)) \
+fw_def(DG1, 0, guc_def(dg1, 62, 0, 0), huc_def(dg1, 7, 9, 3)) \
fw_def(ROCKETLAKE, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl, 7, 9, 3)) \
fw_def(TIGERLAKE, 0, guc_def(tgl, 62, 0, 0), huc_def(tgl, 7, 9, 3)) \
fw_def(JASPERLAKE, 0, guc_def(ehl, 62, 0, 0), huc_def(ehl, 9, 0, 0)) \
@@ -370,7 +372,14 @@ int intel_uc_fw_fetch(struct intel_uc_fw *uc_fw)
if (uc_fw->type == INTEL_UC_FW_TYPE_GUC)
uc_fw->private_data_size = css->private_data_size;
-obj = i915_gem_object_create_shmem_from_data(i915, fw->data, fw->size);
+if (HAS_LMEM(i915)) {
+obj = i915_gem_object_create_lmem_from_data(i915, fw->data, fw->size);
+if (!IS_ERR(obj))
+obj->flags |= I915_BO_ALLOC_PM_EARLY;
+} else {
+obj = i915_gem_object_create_shmem_from_data(i915, fw->data, fw->size);
+}
if (IS_ERR(obj)) {
err = PTR_ERR(obj);
goto fail;
@@ -413,20 +422,25 @@ static void uc_fw_bind_ggtt(struct intel_uc_fw *uc_fw)
{
struct drm_i915_gem_object *obj = uc_fw->obj;
struct i915_ggtt *ggtt = __uc_fw_to_gt(uc_fw)->ggtt;
-struct i915_vma dummy = {
-.node.start = uc_fw_ggtt_offset(uc_fw),
-.node.size = obj->base.size,
-.pages = obj->mm.pages,
-.vm = &ggtt->vm,
-};
+struct i915_vma *dummy = &uc_fw->dummy;
+u32 pte_flags = 0;
+dummy->node.start = uc_fw_ggtt_offset(uc_fw);
+dummy->node.size = obj->base.size;
+dummy->pages = obj->mm.pages;
+dummy->vm = &ggtt->vm;
GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
-GEM_BUG_ON(dummy.node.size > ggtt->uc_fw.size);
+GEM_BUG_ON(dummy->node.size > ggtt->uc_fw.size);
/* uc_fw->obj cache domains were not controlled across suspend */
-drm_clflush_sg(dummy.pages);
+if (i915_gem_object_has_struct_page(obj))
+drm_clflush_sg(dummy->pages);
-ggtt->vm.insert_entries(&ggtt->vm, &dummy, I915_CACHE_NONE, 0);
+if (i915_gem_object_is_lmem(obj))
+pte_flags |= PTE_LM;
+ggtt->vm.insert_entries(&ggtt->vm, dummy, I915_CACHE_NONE, pte_flags);
}
static void uc_fw_unbind_ggtt(struct intel_uc_fw *uc_fw)
@@ -585,13 +599,68 @@ void intel_uc_fw_cleanup_fetch(struct intel_uc_fw *uc_fw)
*/
size_t intel_uc_fw_copy_rsa(struct intel_uc_fw *uc_fw, void *dst, u32 max_len)
{
-struct sg_table *pages = uc_fw->obj->mm.pages;
+struct intel_memory_region *mr = uc_fw->obj->mm.region;
u32 size = min_t(u32, uc_fw->rsa_size, max_len);
u32 offset = sizeof(struct uc_css_header) + uc_fw->ucode_size;
+struct sgt_iter iter;
+size_t count = 0;
+int idx;
/* Called during reset handling, must be atomic [no fs_reclaim] */
GEM_BUG_ON(!intel_uc_fw_is_available(uc_fw));
-return sg_pcopy_to_buffer(pages->sgl, pages->nents, dst, size, offset);
idx = offset >> PAGE_SHIFT;
offset = offset_in_page(offset);
if (i915_gem_object_has_struct_page(uc_fw->obj)) {
struct page *page;
for_each_sgt_page(page, iter, uc_fw->obj->mm.pages) {
u32 len = min_t(u32, size, PAGE_SIZE - offset);
void *vaddr;
if (idx > 0) {
idx--;
continue;
}
vaddr = kmap_atomic(page);
memcpy(dst, vaddr + offset, len);
kunmap_atomic(vaddr);
offset = 0;
dst += len;
size -= len;
count += len;
if (!size)
break;
}
} else {
dma_addr_t addr;
for_each_sgt_daddr(addr, iter, uc_fw->obj->mm.pages) {
u32 len = min_t(u32, size, PAGE_SIZE - offset);
void __iomem *vaddr;
if (idx > 0) {
idx--;
continue;
}
vaddr = io_mapping_map_atomic_wc(&mr->iomap,
addr - mr->region.start);
memcpy_fromio(dst, vaddr + offset, len);
io_mapping_unmap_atomic(vaddr);
offset = 0;
dst += len;
size -= len;
count += len;
if (!size)
break;
}
}
return count;
}
/**


@@ -10,6 +10,7 @@
#include "intel_uc_fw_abi.h"
#include "intel_device_info.h"
#include "i915_gem.h"
+#include "i915_vma.h"
struct drm_printer;
struct drm_i915_private;
@@ -75,6 +76,14 @@ struct intel_uc_fw {
bool user_overridden;
size_t size;
struct drm_i915_gem_object *obj;
+/**
+* @dummy: A vma used in binding the uc fw to ggtt. We can't define this
+* vma on the stack as it can lead to a stack overflow, so we define it
+* here. Safe to have 1 copy per uc fw because the binding is single
+* threaded, as it is done during driver load (inherently single threaded)
+* or during a GT reset (a mutex guarantees single threading).
+*/
+struct i915_vma dummy;
/*
* The firmware build process will generate a version header file with major and


@@ -0,0 +1,127 @@
// SPDX-License-Identifier: MIT
/*
* Copyright © 2021 Intel Corporation
*/
#include "selftests/intel_scheduler_helpers.h"
static struct i915_request *nop_user_request(struct intel_context *ce,
struct i915_request *from)
{
struct i915_request *rq;
int ret;
rq = intel_context_create_request(ce);
if (IS_ERR(rq))
return rq;
if (from) {
ret = i915_sw_fence_await_dma_fence(&rq->submit,
&from->fence, 0,
I915_FENCE_GFP);
if (ret < 0) {
i915_request_put(rq);
return ERR_PTR(ret);
}
}
i915_request_get(rq);
i915_request_add(rq);
return rq;
}
static int intel_guc_scrub_ctbs(void *arg)
{
struct intel_gt *gt = arg;
int ret = 0;
int i;
struct i915_request *last[3] = {NULL, NULL, NULL}, *rq;
intel_wakeref_t wakeref;
struct intel_engine_cs *engine;
struct intel_context *ce;
wakeref = intel_runtime_pm_get(gt->uncore->rpm);
engine = intel_selftest_find_any_engine(gt);
/* Submit requests and inject errors forcing G2H to be dropped */
for (i = 0; i < 3; ++i) {
ce = intel_context_create(engine);
if (IS_ERR(ce)) {
ret = PTR_ERR(ce);
pr_err("Failed to create context, %d: %d\n", i, ret);
goto err;
}
switch (i) {
case 0:
ce->drop_schedule_enable = true;
break;
case 1:
ce->drop_schedule_disable = true;
break;
case 2:
ce->drop_deregister = true;
break;
}
rq = nop_user_request(ce, NULL);
intel_context_put(ce);
if (IS_ERR(rq)) {
ret = PTR_ERR(rq);
pr_err("Failed to create request, %d: %d\n", i, ret);
goto err;
}
last[i] = rq;
}
for (i = 0; i < 3; ++i) {
ret = i915_request_wait(last[i], 0, HZ);
if (ret < 0) {
pr_err("Last request failed to complete: %d\n", ret);
goto err;
}
i915_request_put(last[i]);
last[i] = NULL;
}
/* Force all H2G / G2H to be submitted / processed */
intel_gt_retire_requests(gt);
msleep(500);
/* Scrub missing G2H */
intel_gt_handle_error(engine->gt, -1, 0, "selftest reset");
/* GT will not idle if G2H are lost */
ret = intel_gt_wait_for_idle(gt, HZ);
if (ret < 0) {
pr_err("GT failed to idle: %d\n", ret);
goto err;
}
err:
for (i = 0; i < 3; ++i)
if (last[i])
i915_request_put(last[i]);
intel_runtime_pm_put(gt->uncore->rpm, wakeref);
return ret;
}
int intel_guc_live_selftests(struct drm_i915_private *i915)
{
static const struct i915_subtest tests[] = {
SUBTEST(intel_guc_scrub_ctbs),
};
struct intel_gt *gt = &i915->gt;
if (intel_gt_is_wedged(gt))
return 0;
if (!intel_uc_uses_guc_submission(&gt->uc))
return 0;
return intel_gt_live_subtests(tests, gt);
}


@@ -745,7 +745,7 @@ static void ppgtt_free_spt(struct intel_vgpu_ppgtt_spt *spt)
trace_spt_free(spt->vgpu->id, spt, spt->guest_page.type);
dma_unmap_page(kdev, spt->shadow_page.mfn << I915_GTT_PAGE_SHIFT, 4096,
-PCI_DMA_BIDIRECTIONAL);
+DMA_BIDIRECTIONAL);
radix_tree_delete(&spt->vgpu->gtt.spt_tree, spt->shadow_page.mfn);
@@ -849,7 +849,7 @@ static struct intel_vgpu_ppgtt_spt *ppgtt_alloc_spt(
*/
spt->shadow_page.type = type;
daddr = dma_map_page(kdev, spt->shadow_page.page,
-0, 4096, PCI_DMA_BIDIRECTIONAL);
+0, 4096, DMA_BIDIRECTIONAL);
if (dma_mapping_error(kdev, daddr)) {
gvt_vgpu_err("fail to map dma addr\n");
ret = -EINVAL;
@@ -865,7 +865,7 @@ static struct intel_vgpu_ppgtt_spt *ppgtt_alloc_spt(
return spt;
err_unmap_dma:
-dma_unmap_page(kdev, daddr, PAGE_SIZE, PCI_DMA_BIDIRECTIONAL);
+dma_unmap_page(kdev, daddr, PAGE_SIZE, DMA_BIDIRECTIONAL);
err_free_spt:
free_spt(spt);
return ERR_PTR(ret);
@@ -2409,8 +2409,7 @@ static int alloc_scratch_pages(struct intel_vgpu *vgpu,
return -ENOMEM;
}
-daddr = dma_map_page(dev, virt_to_page(scratch_pt), 0,
-4096, PCI_DMA_BIDIRECTIONAL);
+daddr = dma_map_page(dev, virt_to_page(scratch_pt), 0, 4096, DMA_BIDIRECTIONAL);
if (dma_mapping_error(dev, daddr)) {
gvt_vgpu_err("fail to dmamap scratch_pt\n");
__free_page(virt_to_page(scratch_pt));
@@ -2461,7 +2460,7 @@ static int release_scratch_page_tree(struct intel_vgpu *vgpu)
if (vgpu->gtt.scratch_pt[i].page != NULL) {
daddr = (dma_addr_t)(vgpu->gtt.scratch_pt[i].page_mfn <<
I915_GTT_PAGE_SHIFT);
-dma_unmap_page(dev, daddr, 4096, PCI_DMA_BIDIRECTIONAL);
+dma_unmap_page(dev, daddr, 4096, DMA_BIDIRECTIONAL);
__free_page(vgpu->gtt.scratch_pt[i].page);
vgpu->gtt.scratch_pt[i].page = NULL;
vgpu->gtt.scratch_pt[i].page_mfn = 0;
@@ -2741,7 +2740,7 @@ int intel_gvt_init_gtt(struct intel_gvt *gvt)
}
daddr = dma_map_page(dev, virt_to_page(page), 0,
-4096, PCI_DMA_BIDIRECTIONAL);
+4096, DMA_BIDIRECTIONAL);
if (dma_mapping_error(dev, daddr)) {
gvt_err("fail to dmamap scratch ggtt page\n");
__free_page(virt_to_page(page));
@@ -2755,7 +2754,7 @@ int intel_gvt_init_gtt(struct intel_gvt *gvt)
ret = setup_spt_oos(gvt);
if (ret) {
gvt_err("fail to initialize SPT oos\n");
-dma_unmap_page(dev, daddr, 4096, PCI_DMA_BIDIRECTIONAL);
+dma_unmap_page(dev, daddr, 4096, DMA_BIDIRECTIONAL);
__free_page(gvt->gtt.scratch_page);
return ret;
}
@@ -2779,7 +2778,7 @@ void intel_gvt_clean_gtt(struct intel_gvt *gvt)
dma_addr_t daddr = (dma_addr_t)(gvt->gtt.scratch_mfn <<
I915_GTT_PAGE_SHIFT);
-dma_unmap_page(dev, daddr, 4096, PCI_DMA_BIDIRECTIONAL);
+dma_unmap_page(dev, daddr, 4096, DMA_BIDIRECTIONAL);
__free_page(gvt->gtt.scratch_page);

[Some files were not shown because too many files have changed in this diff]