linux

mirror of https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git synced 2025-01-16 01:54:00 +00:00

Author	SHA1	Message	Date
Dave Airlie	f48ab0a39f	amd-drm-fixes-6.12-2024-11-16: amdgpu: - Revert a swsmu patch to fix a regression -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZzivywAKCRC93/aFa7yZ 2DjYAQD507C9yHfjLWMrpqQIQsnMK5XRobiq8Fdqcu6iDkS8aQEAjM6a7pCIkjpE Q3vlFgcJcEGUrynDL7T+jVOIXLn8FAM= =bM6j -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.12-2024-11-16' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-11-16: amdgpu: - Revert a swsmu patch to fix a regression Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexdeucher@gmail.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241116145320.2507156-1-alexander.deucher@amd.com	2024-11-17 08:12:48 +10:00
Alex Deucher	44f392fbf6	Revert "drm/amd/pm: correct the workload setting" This reverts commit 74e1006430a5377228e49310f6d915628609929e. This causes a regression in the workload selection. A more extensive fix is being worked on. For now, revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3618 Fixes: 74e1006430a5 ("drm/amd/pm: correct the workload setting") Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-16 09:41:11 -05:00
Dave Airlie	21c1c6c7d7	Driver Changes: - Fix unlock on exec ioctl error path (Matthew Brost) - Fix hibernation on LNL due to ggtt getting lost (Matthew Brost / Matthew Auld) - Fix missing runtime PM in OA release (Ashutosh) -----BEGIN PGP SIGNATURE----- iQJNBAABCAA3FiEE6rM8lpABPHM5FqyDm6KlpjDL6lMFAmc2jg4ZHGx1Y2FzLmRl bWFyY2hpQGludGVsLmNvbQAKCRCboqWmMMvqU/gxD/4xKW6qHJe3S9UwfIR7CPtw myrcxfwHG5ryzZsmGh6mwaFxYdGQ4ibdWAKZGF0NrZUsKig+8ADRRE7PoOyyZDV9 0uZ8lOIMPfMPZ8tHvQHbu7l/8LGXcV/R27ts7Rr9vGr2Ox8/5NTGtzykhUDb2xiO qZS4BPF/uuis9MLJRwoAiI2vwrUfFg16To+1NX3rmtBvzbH4DdY9l7QQV5B+22Rc UDQVRAUTXl3ER7TtwwVsHzuctsWJb/5H+1vM6QNIltWWTXPBA8qVC9092SwIWOoF 7c4Ap440V8AHyjjnzSM0N2rO86L6Mo5QY04gv4y7dywKHT+JW95QNWKPizJJeIHd to62AQdlVDCLQqc4jEpIPEorF5wP/qw8Polwa4eTlsGpGMoUkcRVEXO8JA8Rd7Rt EtHWb+ZFproBYtAvrCUyUMlmaviLK+eMsEaNlt+uQqQP3ZAfPvcItcxtSXn3Jsfw 47v/jvP3OddZgOmPHqo1jfZBqACfKt1trqSg2YYOwfGLaZ+KhgZz7rw8n5WKSmyK XEa5y9stffM/3g9PIvroVQc/INAhS++WhXnGRZ/fcxQrFrKQ7sFMXFUs1OUbgv7j GUVZ4V/BGJy/o58bblSla/NSwiX2bHB4d8grS5rtKMMNe9PxgG3ldn8YG4cFADmP NnR/YsBOf1mjvfdiUUMmxQ== =VWVg -----END PGP SIGNATURE----- Merge tag 'drm-xe-fixes-2024-11-14' of https://gitlab.freedesktop.org/drm/xe/kernel into drm-fixes Driver Changes: - Fix unlock on exec ioctl error path (Matthew Brost) - Fix hibernation on LNL due to ggtt getting lost (Matthew Brost / Matthew Auld) - Fix missing runtime PM in OA release (Ashutosh) Signed-off-by: Dave Airlie <airlied@redhat.com> From: Lucas De Marchi <lucas.demarchi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/5ntcf2ssmmvo5dsf2mdcee4guwwmpbm3xrlufgt2pdfmznzjo3@62ygo3bxkock	2024-11-16 04:31:54 +10:00
Dave Airlie	1eb0de899b	amd-drm-fixes-6.12-2024-11-14: amdgpu: - PSR fix - Panel replay fixes - DML fix - vblank power fix - Fix video caps - SMU 14.0 fix - GPUVM fix - MES 12 fix - APU carve out fix - DC vbios fix - NBIO fix -----BEGIN PGP SIGNATURE----- iHUEABYKAB0WIQQgO5Idg2tXNTSZAr293/aFa7yZ2AUCZzYIpwAKCRC93/aFa7yZ 2NVgAQDz3MsCT5vRhIkjBQ8VJqi/k23vvzoHRvBWE7JrfejmuQD+IvDPKmXvneQK M+3vSqDIbpmfwXEghHBdWXTMJqPZNg8= =UkPM -----END PGP SIGNATURE----- Merge tag 'amd-drm-fixes-6.12-2024-11-14' of https://gitlab.freedesktop.org/agd5f/linux into drm-fixes amd-drm-fixes-6.12-2024-11-14: amdgpu: - PSR fix - Panel replay fixes - DML fix - vblank power fix - Fix video caps - SMU 14.0 fix - GPUVM fix - MES 12 fix - APU carve out fix - DC vbios fix - NBIO fix Signed-off-by: Dave Airlie <airlied@redhat.com> From: Alex Deucher <alexander.deucher@amd.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241114143401.448210-1-alexander.deucher@amd.com	2024-11-15 06:48:57 +10:00
Dave Airlie	99d051c4b3	Short summary of fixes pull: bridge: - tc358768: Fix DSI command tx nouveau: - Fix GSP AUX error handling - dp: Handle retires for AUX CH transfers with GSP - fw: Sync DMA after setup panthor: - Fix partial BO mappings to GPU rockchip: - vop: Avoid null-ptr deref in plane-state check vmwgfx: - Avoid null-ptr deref in surface creation -----BEGIN PGP SIGNATURE----- iQEzBAABCgAdFiEEchf7rIzpz2NEoWjlaA3BHVMLeiMFAmc2B4cACgkQaA3BHVML eiOr8AgAkOl/ck2X5Q3f5Fvwbj8sQikNqHUPCBGYf2UUwSstxpbvAxtqBKUV5CyT 0yjaamvFattTTOopzBbEnOpl3cHbwDubmEuaRap1UGdLFE0hFeMA4ycmshKppn14 GbL7x3FSL3GaU3AHNzWowLtLE99OaZM7YLJ/BL3auHDyezg0TXD++OLvjezZhG3x oiO4kNDImOKGSF5nFmTfZkQARHCcm6xv3I+E248RZGeL6HToeZc2lLbENuTCS6la RQBH+OYnLXQDRznoq2Qj5SN2H4jl1nlnJdZZ8Ph592jCpieg3OjO3eGTNsT7SF5L tYZZFIcTSf76YgpkfBPbPWPlRS1KzQ== =qfvW -----END PGP SIGNATURE----- Merge tag 'drm-misc-fixes-2024-11-14' of https://gitlab.freedesktop.org/drm/misc/kernel into drm-fixes Short summary of fixes pull: bridge: - tc358768: Fix DSI command tx nouveau: - Fix GSP AUX error handling - dp: Handle retires for AUX CH transfers with GSP - fw: Sync DMA after setup panthor: - Fix partial BO mappings to GPU rockchip: - vop: Avoid null-ptr deref in plane-state check vmwgfx: - Avoid null-ptr deref in surface creation Signed-off-by: Dave Airlie <airlied@redhat.com> From: Thomas Zimmermann <tzimmermann@suse.de> Link: https://patchwork.freedesktop.org/patch/msgid/20241114142256.GA86810@2a02-2454-fd5e-fd00-4ce-489-4b34-bd1a.dyn6.pyur.net	2024-11-15 06:40:39 +10:00
Everest K.C.	6d9f9115c0	drm/xe/guc: Fix dereference before NULL check The pointer list->list is dereferenced before the NULL check. Fix this by moving the NULL check outside the for loop, so that the check is performed before the dereferencing. The list->list pointer cannot be NULL so this has no effect on runtime. It's just a correctness issue. This issue was reported by Coverity Scan. https://scan7.scan.coverity.com/#/project-view/51525/11354?selectedIssue=1600335 Fixes: 0f1fdf559225 ("drm/xe/guc: Save manual engine capture into capture list") Signed-off-by: Everest K.C. <everestkc@everestkc.com.np> Reviewed-by: Dan Carpenter <dan.carpenter@linaro.org> Signed-off-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241023233356.5479-1-everestkc@everestkc.com.np (cherry picked from commit 2aff81e039de5b0b7ef6bdcb2c320f121f69e2b4) Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>	2024-11-14 14:55:01 +01:00
Francesco Dolcini	32c4514455	drm/bridge: tc358768: Fix DSI command tx Wait for the command transmission to be completed in the DSI transfer function polling for the dc_start bit to go back to idle state after the transmission is started. This is documented in the datasheet and failures to do so lead to commands corruption. Fixes: ff1ca6397b1d ("drm/bridge: Add tc358768 driver") Cc: stable@vger.kernel.org Signed-off-by: Francesco Dolcini <francesco.dolcini@toradex.com> Reviewed-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://lore.kernel.org/r/20240926141246.48282-1-francesco@dolcini.it Signed-off-by: Neil Armstrong <neil.armstrong@linaro.org> Link: https://patchwork.freedesktop.org/patch/msgid/20240926141246.48282-1-francesco@dolcini.it	2024-11-14 11:29:42 +01:00
Chen Ridong	93d1f41a82	drm/vmwgfx: avoid null_ptr_deref in vmw_framebuffer_surface_create_handle The 'vmw_user_object_buffer' function may return NULL with incorrect inputs. To avoid possible null pointer dereference, add a check whether the 'bo' is NULL in the vmw_framebuffer_surface_create_handle. Fixes: d6667f0ddf46 ("drm/vmwgfx: Fix handling of dumb buffers") Signed-off-by: Chen Ridong <chenridong@huawei.com> Signed-off-by: Zack Rusin <zack.rusin@broadcom.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241029083429.1185479-1-chenridong@huaweicloud.com	2024-11-14 02:13:22 -05:00
Dave Airlie	9776c0a75a	nouveau/dp: handle retries for AUX CH transfers with GSP. eb284f4b3781 drm/nouveau/dp: Honor GSP link training retry timeouts tried to fix a problem with panel retires, however it appears the auxch also needs the same treatment, so add the same retry wrapper around it. This fixes some eDP panels after a suspend/resume cycle. Fixes: eb284f4b3781 ("drm/nouveau/dp: Honor GSP link training retry timeouts") Cc: stable@vger.kernel.org Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241111034126.2028401-2-airlied@gmail.com	2024-11-14 11:50:01 +10:00
Dave Airlie	b6ad7debf5	nouveau: handle EBUSY and EAGAIN for GSP aux errors. The upper layer transfer functions expect EBUSY as a return for when retries should be done. Fix the AUX error translation, but also check for both errors in a few places. Fixes: eb284f4b3781 ("drm/nouveau/dp: Honor GSP link training retry timeouts") Cc: stable@vger.kernel.org Reviewed-by: Lyude Paul <lyude@redhat.com> Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241111034126.2028401-1-airlied@gmail.com	2024-11-14 11:50:01 +10:00
Dave Airlie	21ec425eaf	nouveau: fw: sync dma after setup is called. When this code moved to non-coherent allocator the sync was put too early for some firmwares which called the setup function, move the sync down after the setup function. Reported-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Tested-by: Diogo Ivo <diogo.ivo@tecnico.ulisboa.pt> Reviewed-by: Lyude Paul <lyude@redhat.com> Fixes: 9b340aeb26d5 ("nouveau/firmware: use dma non-coherent allocator") Cc: stable@vger.kernel.org Signed-off-by: Dave Airlie <airlied@redhat.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241114004603.3095485-1-airlied@gmail.com	2024-11-14 11:50:01 +10:00
Ashutosh Dixit	c0403e4cee	drm/xe/oa: Fix "Missing outer runtime PM protection" warning Fix the following drm_WARN: [953.586396] xe 0000:00:02.0: [drm] Missing outer runtime PM protection ... <4> [953.587090] ? xe_pm_runtime_get_noresume+0x8d/0xa0 [xe] <4> [953.587208] guc_exec_queue_add_msg+0x28/0x130 [xe] <4> [953.587319] guc_exec_queue_fini+0x3a/0x40 [xe] <4> [953.587425] xe_exec_queue_destroy+0xb3/0xf0 [xe] <4> [953.587515] xe_oa_release+0x9c/0xc0 [xe] Suggested-by: John Harrison <john.c.harrison@intel.com> Suggested-by: Matthew Brost <matthew.brost@intel.com> Fixes: e936f885f1e9 ("drm/xe/oa/uapi: Expose OA stream fd") Cc: stable@vger.kernel.org Signed-off-by: Ashutosh Dixit <ashutosh.dixit@intel.com> Reviewed-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241109032003.3093811-1-ashutosh.dixit@intel.com (cherry picked from commit b107c63d2953907908fd0cafb0e543b3c3167b75) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-13 11:37:22 -08:00
Arnd Bergmann	1876c788bb	A few more Qualcomm driver updates for v6.13 Make the Adreno driver invoke the SMMU aperture setup firmware function, which is required to allow the GPU to manage per-process page tables in some firmware versions - as an example Rb3Gen2 has no GPU without this. Add X1E Devkit to the list of devices that has functional EFI variable access through the uefisecapp. Flip the "manual slice configuration quirk" in the Qualcomm LLCC driver, as this only applies to a single platform, and introduce support for QCS8300, QCS615, SAR2130P, and SAR1130P. Lastly, add IPQ5424 and IPQ5404 to the Qualcomm socinfo driver. -----BEGIN PGP SIGNATURE----- iQJJBAABCAAzFiEEBd4DzF816k8JZtUlCx85Pw2ZrcUFAmc0G8EVHGFuZGVyc3Nv bkBrZXJuZWwub3JnAAoJEAsfOT8Nma3FNLwP/1g/MKjDI3AT0esqzTvD5AUgpCab Vb20nlK57Qt64k6WdQ8DLVygA7sb9iB5mq/KYSr3k4+S1DdG8qLzRkL0N/uD9Hwv AHEb7pPtuFIFSsV8+o41SPn30lg6AkxkjbGgMUDrT0hvZcBxUzGHRFbs8HM5ShCg i1Sy/eZje+iX8wOF5xW960eMT3e/8FvW145nLi1uYuaCyCh0A/wNpahjtKpA0q6A s3rnhDye6qJQxcd0xk7fYCbc39iBF4D1Jvrog6Mc4TLKdLULnXIPIJxc7NRm8uGw WbhYwNsYmCGkErAorvPtEDacYxi7sFAwMD+ALayBnhnsU/l5Nz44JfZwIqKUg+WJ bsc+kJRT3oIfwfSJO+tBFI80XDCiTb6Wyyn2lemO1L5NQgDMLxFXBo7tadScS2FM /QEYJvZj4SVfG6rpfD/+4ZVw6FjKnkhDTYoB4Q4uqia4GLfZtIl925xJtaNCtQNM nku4sL+HK2HdL1HuZuVE1Uamt62sdzL2iqwctcjoPpIcq+WB8XwPfshg/iO57vao IFVcPUs/enoCU7PtovQWrI+CcVxaZvOOst4M0Pou9DBi3jGLbVdS9/kt7HEFwbD9 qmdydfPZty2lMHvFl5Q6PzfszGf4+78em/dtz5/BUgB2C7Y56427fbbu0g+ADf8w z8Hq5lM++Qh7SwNd =Wqwn -----END PGP SIGNATURE----- gpgsig -----BEGIN PGP SIGNATURE----- iQIzBAABCgAdFiEEiK/NIGsWEZVxh/FrYKtH/8kJUicFAmc05DAACgkQYKtH/8kJ UidXZQ//b9NHOLIaYNBcxSFT5ibK1orcCpWcUQlzkwKRwz8kXMsXW41t3R/Ab1Wk nJhQFwWL28sDtDTK13mdfSM4dnDuoqtjIk4/40LGyivla7m5LCPr775xFYJzfRPc z0gZvabNp4syKC4l5FCFILOxeYE2cyDkzGMxlTg5ztUfF4AsFr3PrJq9wmvhfQo8 GRm6XSsAaq7cshEwYqJRtDfR3i6vzaqbbn54/HHBTpMZnhus2iZk92Q1gEekD2fi UjySAmdOFCjCIPCxxrRMEPupHjT20DqJVK//m/kNCNAw4i4049o1Xu/KjoymPQNa Wg/UjS5Lsne58dFI1Kmmdat9nqrVrd0ZXQd4mpERSJKYAGptO73PNofzFDvtuLYX V3usgZIZJzk/L+amkVIeY/Ot8fPgiRgb2G5SIcWcIBqiUtwNyw3TOnxcswcml8JA gzneVx4LryrRPrjJGTb1LqefwH4Rs48RpKUVDrHN7vxK3r3EKirqxQtTic7kdrJp GbR0OdkfIcLF3k4uSMDSfKlIB7kcldGdJw8ZJY0Qn4yJr7Uss+N4jk6U8eqkW0AC 0LKGXk4ym9t8rFYnfck/toGDtXMFnjNFnY+SYq9F241Ux08l2stme14PG7Mu0tyY dxEbV1FYkxy55LVYgW0hK/Y5+ogGusNfJICZSTen6NDGK38+x4I= =yuqG -----END PGP SIGNATURE----- Merge tag 'qcom-drivers-for-6.13-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux into soc/drivers A few more Qualcomm driver updates for v6.13 Make the Adreno driver invoke the SMMU aperture setup firmware function, which is required to allow the GPU to manage per-process page tables in some firmware versions - as an example Rb3Gen2 has no GPU without this. Add X1E Devkit to the list of devices that has functional EFI variable access through the uefisecapp. Flip the "manual slice configuration quirk" in the Qualcomm LLCC driver, as this only applies to a single platform, and introduce support for QCS8300, QCS615, SAR2130P, and SAR1130P. Lastly, add IPQ5424 and IPQ5404 to the Qualcomm socinfo driver. * tag 'qcom-drivers-for-6.13-2' of https://git.kernel.org/pub/scm/linux/kernel/git/qcom/linux: soc: qcom: ice: Remove the device_link field in qcom_ice drm/msm/adreno: Setup SMMU aparture for per-process page table firmware: qcom: scm: Introduce CP_SMMU_APERTURE_ID soc: qcom: socinfo: add IPQ5424/IPQ5404 SoC ID dt-bindings: arm: qcom,ids: add SoC ID for IPQ5424/IPQ5404 soc: qcom: llcc: Flip the manual slice configuration condition dt-bindings: firmware: qcom,scm: Document sm8750 SCM firmware: qcom: uefisecapp: Allow X1E Devkit devices soc: qcom: llcc: Add LLCC configuration for the QCS8300 platform dt-bindings: cache: qcom,llcc: Document the QCS8300 LLCC soc: qcom: llcc: Add configuration data for QCS615 dt-bindings: cache: qcom,llcc: Document the QCS615 LLCC soc: qcom: llcc: add support for SAR2130P and SAR1130P soc: qcom: llcc: use deciman integers for bit shift values dt-bindings: cache: qcom,llcc: document SAR2130P and SAR1130P Link: https://lore.kernel.org/r/20241113032425.356306-1-andersson@kernel.org Signed-off-by: Arnd Bergmann <arnd@arndb.de>	2024-11-13 18:38:56 +01:00
Matthew Auld	be7eeaba2a	drm/xe: handle flat ccs during hibernation on igpu Starting from LNL, CCS has moved over to flat CCS model where there is now dedicated memory reserved for storing compression state. On platforms like LNL this reserved memory lives inside graphics stolen memory, which is not treated like normal RAM and is therefore skipped by the core kernel when creating the hibernation image. Currently if something was compressed and we enter hibernation all the corresponding CCS state is lost on such HW, resulting in corrupted memory. To fix this evict user buffers from TT -> SYSTEM to ensure we take a snapshot of the raw CCS state when entering hibernation, where upon resuming we can restore the raw CCS state back when next validating the buffer. This has been confirmed to fix display corruption on LNL when coming back from hibernation. Fixes: cbdc52c11c9b ("drm/xe/xe2: Support flat ccs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3409 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241112162827.116523-2-matthew.auld@intel.com (cherry picked from commit c8b3c6db941299d7cc31bd9befed3518fdebaf68) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-13 07:58:58 -08:00
Matthew Auld	46f1f4b0f3	drm/xe: improve hibernation on igpu The GGTT looks to be stored inside stolen memory on igpu which is not treated as normal RAM. The core kernel skips this memory range when creating the hibernation image, therefore when coming back from hibernation the GGTT programming is lost. This seems to cause issues with broken resume where GuC FW fails to load: [drm] ERROR GT0: load failed: status = 0x400000A0, time = 10ms, freq = 1250MHz (req 1300MHz), done = -1 [drm] ERROR GT0: load failed: status: Reset = 0, BootROM = 0x50, UKernel = 0x00, MIA = 0x00, Auth = 0x01 [drm] ERROR GT0: firmware signature verification failed [drm] ERROR CRITICAL: Xe has declared device 0000:00:02.0 as wedged. Current GGTT users are kernel internal and tracked as pinned, so it should be possible to hook into the existing save/restore logic that we use for dgpu, where the actual evict is skipped but on restore we importantly restore the GGTT programming. This has been confirmed to fix hibernation on at least ADL and MTL, though likely all igpu platforms are affected. This also means we have a hole in our testing, where the existing s4 tests only really test the driver hooks, and don't go as far as actually rebooting and restoring from the hibernation image and in turn powering down RAM (and therefore losing the contents of stolen). v2 (Brost) - Remove extra newline and drop unnecessary parentheses. Fixes: dd08ebf6c352 ("drm/xe: Introduce a new DRM driver for Intel GPUs") Link: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3275 Signed-off-by: Matthew Auld <matthew.auld@intel.com> Cc: Matthew Brost <matthew.brost@intel.com> Cc: <stable@vger.kernel.org> # v6.8+ Reviewed-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Lucas De Marchi <lucas.demarchi@intel.com> Signed-off-by: Matthew Brost <matthew.brost@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241101170156.213490-2-matthew.auld@intel.com (cherry picked from commit f2a6b8e396666d97ada8e8759dfb6a69d8df6380) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-13 07:58:31 -08:00
Matthew Brost	dd886a63d6	drm/xe: Restore system memory GGTT mappings GGTT mappings reside on the device and this state is lost during suspend / d3cold thus this state must be restored resume regardless if the BO is in system memory or VRAM. v2: - Unnecessary parentheses around bo->placements[0] (Checkpatch) Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Matthew Auld <matthew.auld@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241031182257.2949579-1-matthew.brost@intel.com (cherry picked from commit a19d1db9a3fa89fabd7c83544b84f393ee9b851f) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-13 07:58:22 -08:00
Matthew Brost	ce0d697023	drm/xe: Ensure all locks released in exec IOCTL In couple of places the wrong error handling goto was used to release locks. Fix these to ensure all locks dropped on exec IOCTL errors. Cc: Francois Dugast <francois.dugast@intel.com> Fixes: d16ef1a18e39 ("drm/xe/exec: Switch hw engine group execution mode upon job submission") Signed-off-by: Matthew Brost <matthew.brost@intel.com> Reviewed-by: Francois Dugast <francois.dugast@intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241106224944.30130-1-matthew.brost@intel.com (cherry picked from commit 9e7aacd8402b88394e6a83cb242901fde77a1773) Signed-off-by: Lucas De Marchi <lucas.demarchi@intel.com>	2024-11-13 07:49:14 -08:00
Akash Goel	3387e04391	drm/panthor: Fix handling of partial GPU mapping of BOs This commit fixes the bug in the handling of partial mapping of the buffer objects to the GPU, which caused kernel warnings. Panthor didn't correctly handle the case where the partial mapping spanned multiple scatterlists and the mapping offset didn't point to the 1st page of starting scatterlist. The offset variable was not cleared after reaching the starting scatterlist. Following warning messages were seen. WARNING: CPU: 1 PID: 650 at drivers/iommu/io-pgtable-arm.c:659 __arm_lpae_unmap+0x254/0x5a0 <snip> pc : __arm_lpae_unmap+0x254/0x5a0 lr : __arm_lpae_unmap+0x2cc/0x5a0 <snip> Call trace: __arm_lpae_unmap+0x254/0x5a0 __arm_lpae_unmap+0x108/0x5a0 __arm_lpae_unmap+0x108/0x5a0 __arm_lpae_unmap+0x108/0x5a0 arm_lpae_unmap_pages+0x80/0xa0 panthor_vm_unmap_pages+0xac/0x1c8 [panthor] panthor_gpuva_sm_step_unmap+0x4c/0xc8 [panthor] op_unmap_cb.isra.23.constprop.30+0x54/0x80 __drm_gpuvm_sm_unmap+0x184/0x1c8 drm_gpuvm_sm_unmap+0x40/0x60 panthor_vm_exec_op+0xa8/0x120 [panthor] panthor_vm_bind_exec_sync_op+0xc4/0xe8 [panthor] panthor_ioctl_vm_bind+0x10c/0x170 [panthor] drm_ioctl_kernel+0xbc/0x138 drm_ioctl+0x210/0x4b0 __arm64_sys_ioctl+0xb0/0xf8 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.1+0x98/0xf8 do_el0_svc+0x24/0x38 el0_svc+0x34/0xc8 el0t_64_sync_handler+0xa0/0xc8 el0t_64_sync+0x174/0x178 <snip> panthor : [drm] drm_WARN_ON(unmapped_sz != pgsize * pgcount) WARNING: CPU: 1 PID: 650 at drivers/gpu/drm/panthor/panthor_mmu.c:922 panthor_vm_unmap_pages+0x124/0x1c8 [panthor] <snip> pc : panthor_vm_unmap_pages+0x124/0x1c8 [panthor] lr : panthor_vm_unmap_pages+0x124/0x1c8 [panthor] <snip> panthor : [drm] ERROR failed to unmap range ffffa388f000-ffffa3890000 (requested range ffffa388c000-ffffa3890000) Fixes: 647810ec2476 ("drm/panthor: Add the MMU/VM logical block") Signed-off-by: Akash Goel <akash.goel@arm.com> Reviewed-by: Liviu Dudau <liviu.dudau@arm.com> Reviewed-by: Steven Price <steven.price@arm.com> Reviewed-by: Boris Brezillon <boris.brezillon@collabora.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241111134720.780403-1-akash.goel@arm.com Signed-off-by: Liviu Dudau <liviu.dudau@arm.com>	2024-11-13 00:30:37 +00:00
Vijendar Mukunda	7013a8268d	drm/amd: Fix initialization mistake for NBIO 7.7.0 There is a strapping issue on NBIO 7.7.0 that can lead to spurious PME events while in the D0 state. Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241112161142.28974-1-mario.limonciello@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 447a54a0f79c9a409ceaa17804bdd2e0206397b9) Cc: stable@vger.kernel.org	2024-11-12 17:37:39 -05:00
Alex Deucher	5f77ee21eb	Revert "drm/amd/display: parse umc_info or vram_info based on ASIC" This reverts commit 694c79769cb384bca8b1ec1d1e84156e726bd106. This was not the root cause. Revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com (cherry picked from commit 3c2296b1eec55b50c64509ba15406142d4a958dc) Cc: stable@vger.kernel.org # 6.11.x	2024-11-12 17:37:39 -05:00
Hamish Claxton	4bb2f52ac0	drm/amd/display: Fix failure to read vram info due to static BP_RESULT The static declaration causes the check to fail. Remove it. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Fixes: 00c391102abc ("drm/amd/display: Add misc DC changes for DCN401") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamish Claxton <hamishclaxton@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com (cherry picked from commit 91314e7dfd83345b8b820b782b2511c9c32866cd) Cc: stable@vger.kernel.org # 6.11.x	2024-11-12 17:37:38 -05:00
Christian König	5a67c31669	drm/amdgpu: enable GTT fallback handling for dGPUs only That is just a waste of time on APUs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3704 Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM\|GTT") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit e8fc090d322346e5ce4c4cfe03a8100e31f61c3c) Cc: stable@vger.kernel.org	2024-11-12 17:37:38 -05:00
Vijendar Mukunda	447a54a0f7	drm/amd: Fix initialization mistake for NBIO 7.7.0 There is a strapping issue on NBIO 7.7.0 that can lead to spurious PME events while in the D0 state. Co-developed-by: Mario Limonciello <mario.limonciello@amd.com> Signed-off-by: Vijendar Mukunda <Vijendar.Mukunda@amd.com> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Link: https://lore.kernel.org/r/20241112161142.28974-1-mario.limonciello@amd.com Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-12 17:10:40 -05:00
Alex Deucher	3c2296b1ee	Revert "drm/amd/display: parse umc_info or vram_info based on ASIC" This reverts commit 2551b4a321a68134360b860113dd460133e856e5. This was not the root cause. Revert. Link: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com	2024-11-12 17:10:40 -05:00
Hamish Claxton	91314e7dfd	drm/amd/display: Fix failure to read vram info due to static BP_RESULT The static declaration causes the check to fail. Remove it. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3678 Fixes: 00c391102abc ("drm/amd/display: Add misc DC changes for DCN401") Reviewed-by: Harry Wentland <harry.wentland@amd.com> Signed-off-by: Hamish Claxton <hamishclaxton@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> Cc: aurabindo.pillai@amd.com Cc: hamishclaxton@gmail.com	2024-11-12 17:10:40 -05:00
Christian König	e8fc090d32	drm/amdgpu: enable GTT fallback handling for dGPUs only That is just a waste of time on APUs. Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3704 Fixes: 216c1282dde3 ("drm/amdgpu: use GTT only as fallback for VRAM\|GTT") Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Christian König <christian.koenig@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-12 17:10:05 -05:00
Shaoyun Liu	8521e3c5f0	drm/amd/amdgpu: limit single process inside MES This is for MES to limit only one process for the user queues Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-12 17:02:04 -05:00
Ville Syrjälä	67e023b93d	drm/i915: Grab intel_display from the encoder to avoid potential oopsies Grab the intel_display from 'encoder' rather than 'state' in the encoder hooks to avoid the massive footgun that is intel_sanitize_encoder(), which passes NULL as the 'state' argument to encoder .disable() and .post_disable(). TODO: figure out how to actually fix intel_sanitize_encoder()... Fixes: ab0b0eb5c85c ("drm/i915/tv: convert to struct intel_display") Signed-off-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241107161123.16269-2-ville.syrjala@linux.intel.com Reviewed-by: Jani Nikula <jani.nikula@intel.com> (cherry picked from commit dc3806d9eb66d0105f8d55d462d4ef681d9eac59) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2024-11-12 11:08:06 +02:00
Daniele Ceraolo Spurio	db0fc586ed	drm/i915/gsc: ARL-H and ARL-U need a newer GSC FW. All MTL and ARL SKUs share the same GSC FW, but the newer platforms are only supported in newer blobs. In particular, ARL-S is supported starting from 102.0.10.1878 (which is already the minimum required version for ARL in the code), while ARL-H and ARL-U are supported from 102.1.15.1926. Therefore, the driver needs to check which specific ARL subplatform its running on when verifying that the GSC FW is new enough for it. Fixes: 2955ae8186c8 ("drm/i915: ARL requires a newer GSC firmware") Signed-off-by: Daniele Ceraolo Spurio <daniele.ceraolospurio@intel.com> Cc: John Harrison <John.C.Harrison@Intel.com> Cc: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: Rodrigo Vivi <rodrigo.vivi@intel.com> Reviewed-by: John Harrison <John.C.Harrison@Intel.com> Link: https://patchwork.freedesktop.org/patch/msgid/20241028233132.149745-1-daniele.ceraolospurio@intel.com (cherry picked from commit 3c1d5ced18db8a67251c8436cf9bdc061f972bdb) Signed-off-by: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>	2024-11-12 09:44:55 +02:00
Jack Xiao	79365ea707	drm/amdgpu/mes12: correct kiq unmap latency Correct kiq unmap queue timeout value. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit cfe98204a06329b6b7fce1b828b7d620473181ff) Cc: stable@vger.kernel.org # 6.11.x	2024-11-11 14:05:51 -05:00
Christian König	0e5ac88fb9	drm/amdgpu: fix check in gmc_v9_0_get_vm_pte() The coherency flags can only be determined when the BO is locked and that in turn is only guaranteed when the mapping is validated. Fix the check, move the resource check into the function and add an assert that the BO is locked. Signed-off-by: Christian König <christian.koenig@amd.com> Fixes: d1a372af1c3d ("drm/amdgpu: Set MTYPE in PTE based on BO flags") Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 1b4ca8546f5b5c482717bedb8e031227b1541539) Cc: stable@vger.kernel.org	2024-11-11 14:05:44 -05:00
Tim Huang	df0279e2a1	drm/amd/pm: print pp_dpm_mclk in ascending order on SMU v14.0.0 Currently, the pp_dpm_mclk values are reported in descending order on SMU IP v14.0.0/1/4. Adjust to ascending order for consistency with other clock interfaces. Signed-off-by: Tim Huang <tim.huang@amd.com> Reviewed-by: Yifan Zhang <yifan1.zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit d4be16ccfd5bf822176740a51ff2306679a2247e) Cc: stable@vger.kernel.org	2024-11-11 14:05:39 -05:00
David Rosca	d641a151fc	drm/amdgpu: Fix video caps for H264 and HEVC encode maximum size H264 supports 4096x4096 starting from Polaris. HEVC also supports 4096x4096, with VCN 3 and newer 8192x4352 is supported. Signed-off-by: David Rosca <david.rosca@amd.com> Reviewed-by: Leo Liu <leo.liu@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 69e9a9e65b1ea542d07e3fdd4222b46e9f5a3a29) Cc: stable@vger.kernel.org	2024-11-11 14:05:36 -05:00
Rodrigo Siqueira	16dd2825c2	drm/amd/display: Adjust VSDB parser for replay feature At some point, the IEEE ID identification for the replay check in the AMD EDID was added. However, this check causes the following out-of-bounds issues when using KASAN: [ 27.804016] BUG: KASAN: slab-out-of-bounds in amdgpu_dm_update_freesync_caps+0xefa/0x17a0 [amdgpu] [ 27.804788] Read of size 1 at addr ffff8881647fdb00 by task systemd-udevd/383 ... [ 27.821207] Memory state around the buggy address: [ 27.821215] ffff8881647fda00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 27.821224] ffff8881647fda80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 27.821234] >ffff8881647fdb00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 27.821243] ^ [ 27.821250] ffff8881647fdb80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 27.821259] ffff8881647fdc00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 27.821268] ================================================================== This is caused because the ID extraction happens outside of the range of the edid lenght. This commit addresses this issue by considering the amd_vsdb_block size. Cc: ChiaHsuan Chung <chiahsuan.chung@amd.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Rodrigo Siqueira <Rodrigo.Siqueira@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit b7e381b1ccd5e778e3d9c44c669ad38439a861d8) Cc: stable@vger.kernel.org	2024-11-11 14:05:30 -05:00
Dillon Varone	9fc0cbcb6e	drm/amd/display: Require minimum VBlank size for stutter optimization If the nominal VBlank is too small, optimizing for stutter can cause the prefetch bandwidth to increase drasticaly, resulting in higher clock and power requirements. Only optimize if it is >3x the stutter latency. Reviewed-by: Austin Zheng <austin.zheng@amd.com> Signed-off-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit 003215f962cdf2265f126a3f4c9ad20917f87fca) Cc: stable@vger.kernel.org	2024-11-11 14:05:26 -05:00
Ryan Seto	6825cb07b7	drm/amd/display: Handle dml allocation failure to avoid crash [Why] In the case where a dml allocation fails for any reason, the current state's dml contexts would no longer be valid. Then subsequent calls dc_state_copy_internal would shallow copy invalid memory and if the new state was released, a double free would occur. [How] Reset dml pointers in new_state to NULL and avoid invalid pointer Reviewed-by: Dillon Varone <dillon.varone@amd.com> Signed-off-by: Ryan Seto <ryanseto@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit bcafdc61529a48f6f06355d78eb41b3aeda5296c) Cc: stable@vger.kernel.org	2024-11-11 14:05:22 -05:00
Tom Chung	bd8a957661	drm/amd/display: Fix Panel Replay not update screen correctly [Why] In certain use case such as KDE login screen, there will be no atomic commit while do the frame update. If the Panel Replay enabled, it will cause the screen not updated and looks like system hang. [How] Delay few atomic commits before enabled the Panel Replay just like PSR. Fixes: be64336307a6c ("drm/amd/display: Re-enable panel replay feature") Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3686 Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3682 Tested-By: Corey Hickey <bugfood-c@fatooh.org> Tested-By: James Courtier-Dutton <james.dutton@gmail.com> Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit ca628f0eddd73adfccfcc06b2a55d915bca4a342) Cc: stable@vger.kernel.org # 6.11+	2024-11-11 14:05:15 -05:00
Tom Chung	b8d9d5fef4	drm/amd/display: Change some variable name of psr Panel Replay feature may also use the same variable with PSR. Change the variable name and make it not specify for PSR. Reviewed-by: Leo Li <sunpeng.li@amd.com> Signed-off-by: Tom Chung <chiahsuan.chung@amd.com> Signed-off-by: Hamza Mahfooz <hamza.mahfooz@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com> (cherry picked from commit c7fafb7a46b38a11a19342d153f505749bf56f3e) Cc: stable@vger.kernel.org # 6.11+	2024-11-11 14:05:11 -05:00
Bjorn Andersson	98e5b7f983	drm/msm/adreno: Setup SMMU aparture for per-process page table Support for per-process page tables requires the SMMU aparture to be setup such that the GPU can make updates with the SMMU. On some targets this is done statically in firmware, on others it's expected to be requested in runtime by the driver, through a SCM call. One place where configuration is expected to be done dynamically is the QCS6490 rb3gen2. The downstream driver does this unconditioanlly on any A6xx and newer, so follow suite and make the call. Signed-off-by: Bjorn Andersson <bjorn.andersson@oss.qualcomm.com> Reviewed-by: Rob Clark <robdclark@gmail.com> Link: https://lore.kernel.org/r/20241110-adreno-smmu-aparture-v2-2-9b1fb2ee41d4@oss.qualcomm.com Signed-off-by: Bjorn Andersson <andersson@kernel.org>	2024-11-11 12:03:27 -06:00
Jack Xiao	cfe98204a0	drm/amdgpu/mes12: correct kiq unmap latency Correct kiq unmap queue timeout value. Signed-off-by: Jack Xiao <Jack.Xiao@amd.com> Reviewed-by: Hawking Zhang <Hawking.Zhang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 12:22:58 -05:00
Stanley.Yang	5bea9bbb45	drm/amdgpu: Support vcn and jpeg error info parsing Add vcn and jpeg error count parsing. Signed-off-by: Stanley.Yang <Stanley.Yang@amd.com> Reviewed-by: Yang Wang <kevinyang.wang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 12:22:58 -05:00
Shaoyun Liu	ce4971388c	drm/amd : Update MES API header file for v11 & v12 New features require the new fields defines Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 12:22:58 -05:00
Shaoyun Liu	af5661c7c7	drm/amd/amdkfd: add/remove kfd queues on start/stop KFD scheduling Add back kfd queues in start scheduling that originally been removed on stop scheduling. Signed-off-by: Shaoyun Liu <shaoyun.liu@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Reviewed-by: Philip Yang <Philip.Yang@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 12:22:39 -05:00
Xiaogang Chen	6cb6d437b5	drm/amdkfd: change kfd process kref count at creation kfd process kref count(process->ref) is initialized to 1 by kref_init. After it is created not need to increase its kref. Instad add kfd process kref at kfd process mmu notifier allocation since we already decrease the kref at free_notifier of mmu_notifier_ops, so pair them. When user process opens kfd node multiple times the kfd process kref is increased each time to balance with kfd node close operation. Signed-off-by: Xiaogang Chen <xiaogang.chen@amd.com> Reviewed-by: Felix Kuehling <felix.kuehling@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 11:56:08 -05:00
Advait Dhamorikar	408d208127	drm/amdgpu: Cleanup shift coding style Improves the coding style by updating bit-shift operations in the amdgpu_jpeg.c driver file. It ensures consistency and avoids potential issues by explicitly using 1U and 1ULL for unsigned and unsigned long long shifts in all relevant instances. Signed-off-by: Advait Dhamorikar <advaitdhamorikar@gmail.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 11:56:03 -05:00
shaoyunl	92fd1714ee	drm/amd/amdgpu: Increase MES log buffer to dump mes scratch data MES internal scratch data is useful for mes debug, it can only located in VRAM, change the allocation type and increase size for mes 11 Signed-off-by: shaoyunl <shaoyun.liu@amd.com> Acked-by: Feifei Xu <Feifei.Xu@amd.com> Acked-by: Alex Deucher <alexander.deucher@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 11:55:49 -05:00
Victor Skvortsov	84a2947ecc	drm/amdgpu: Implement virt req_ras_err_count Enable RAS late init if VF RAS Telemetry is supported. When enabled, the VF can use this interface to query total RAS error counts from the host. The VF FB access may abruptly end due to a fatal error, therefore the VF must cache and sanitize the input. The Host allows 15 Telemetry messages every 60 seconds, afterwhich the host will ignore any more in-coming telemetry messages. The VF will rate limit its msg calling to once every 5 seconds (12 times in 60 seconds). While the VF is rate limited, it will continue to report the last good cached data. v2: Flip generate report & update statistics order for VF Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Acked-by: Tao Zhou <tao.zhou1@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 11:55:42 -05:00
Victor Skvortsov	907fec2dfd	drm/amdgpu: VF Query RAS Caps from Host if supported If VF RAS Capability support is enabled, guest is able to retrieve the real RAS support from the host. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 11:55:36 -05:00
Victor Skvortsov	9928509dfc	drm/amdgpu: Add msg handlers for SRIOV RAS Telemetry Add message handlers for RAS telemetry. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 11:55:08 -05:00
Victor Skvortsov	60c58d72af	drm/amdgpu: Update SRIOV Exchange Headers for RAS Telemetry Support The SRIOV PF/VF Data exchange is extended by 64KB for VF RAS Telemetry data. Add Host RAS Telemetry enable capabilities bitfields. Add a new VF msg REQ_RAS_ERROR_COUNT, the host response data will be populated in the RAS Telemetry region. Signed-off-by: Victor Skvortsov <victor.skvortsov@amd.com> Reviewed-by: Zhigang Luo <zhigang.luo@amd.com> Signed-off-by: Alex Deucher <alexander.deucher@amd.com>	2024-11-11 11:55:01 -05:00

... 3 4 5 6 7 ...

110236 Commits