- Add a new start parameter in trim function to specify exact
address from where to start the trimming. This would help us
in situations like if drivers would like to do address alignment
for specific requirements.
- Add a new flag DRM_BUDDY_TRIM_DISABLE. Drivers can use this
flag to disable the allocator trimming part. This patch enables
the drivers control trimming and they can do it themselves
based on the application requirements.
v1:(Matthew)
- check new_start alignment with min chunk_size
- use range_overflows()
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Alex Deucher <alexander.deucher@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Signed-off-by: Alex Deucher <alexander.deucher@amd.com>
(cherry picked from commit db65eb46de135338d6177f8853e0fd208f19d63e)
The drm_buddy minimum page-size requirements should be distinct from the
CPU PAGE_SIZE. Only restriction is that the minimum page-size is at
least 4K.
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Cc: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Reviewed-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Link: https://patchwork.freedesktop.org/patch/msgid/20240229105112.250077-3-matthew.auld@intel.com
Signed-off-by: Christian König <christian.koenig@amd.com>
- Add tracking clear page feature.
- Driver should enable the DRM_BUDDY_CLEARED flag if it
successfully clears the blocks in the free path. On the otherhand,
DRM buddy marks each block as cleared.
- Track the available cleared pages size
- If driver requests cleared memory we prefer cleared memory
but fallback to uncleared if we can't find the cleared blocks.
when driver requests uncleared memory we try to use uncleared but
fallback to cleared memory if necessary.
- When a block gets freed we clear it and mark the freed block as cleared,
when there are buddies which are cleared as well we can merge them.
Otherwise, we prefer to keep the blocks as separated.
- Add a function to support defragmentation.
v1:
- Depends on the flag check DRM_BUDDY_CLEARED, enable the block as
cleared. Else, reset the clear flag for each block in the list(Christian)
- For merging the 2 cleared blocks compare as below,
drm_buddy_is_clear(block) != drm_buddy_is_clear(buddy)(Christian)
- Defragment the memory beginning from min_order
till the required memory space is available.
v2: (Matthew)
- Add a wrapper drm_buddy_free_list_internal for the freeing of blocks
operation within drm buddy.
- Write a macro block_incompatible() to allocate the required blocks.
- Update the xe driver for the drm_buddy_free_list change in arguments.
- add a warning if the two blocks are incompatible on
defragmentation
- call full defragmentation in the fini() function
- place a condition to test if min_order is equal to 0
- replace the list with safe_reverse() variant as we might
remove the block from the list.
v3:
- fix Gitlab user reported lockup issue.
- Keep DRM_BUDDY_HEADER_CLEAR define sorted(Matthew)
- modify to pass the root order instead max_order in fini()
function(Matthew)
- change bool 1 to true(Matthew)
- add check if min_block_size is power of 2(Matthew)
- modify the min_block_size datatype to u64(Matthew)
v4:
- rename the function drm_buddy_defrag with __force_merge.
- Include __force_merge directly in drm buddy file and remove
the defrag use in amdgpu driver.
- Remove list_empty() check(Matthew)
- Remove unnecessary space, headers and placement of new variables(Matthew)
- Add a unit test case(Matthew)
v5:
- remove force merge support to actual range allocation and not to bail
out when contains && split(Matthew)
- add range support to force merge function.
v6:
- modify the alloc_range() function clear page non merged blocks
allocation(Matthew)
- correct the list_insert function name(Matthew).
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Signed-off-by: Matthew Auld <matthew.auld@intel.com>
Suggested-by: Christian König <christian.koenig@amd.com>
Suggested-by: Matthew Auld <matthew.auld@intel.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20240419063538.11957-1-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Problem statement: The current method roundup_power_of_two()
to allocate contiguous address triggers -ENOSPC in some cases
even though we have enough free spaces and so to help with
that we introduce a try harder mechanism.
In case of -ENOSPC, the new try harder mechanism rounddown the
original size to power of 2 and iterating over the round down
sized freelist blocks to allocate the required size traversing
RHS and LHS.
As part of the above new method implementation we moved
contiguous/alignment size computation part and trim function
to the drm buddy file.
v2: Modify the alloc_range() function to return total allocated size
on -ENOSPC err and traverse RHS/LHS to allocate the required
size (Matthew).
Signed-off-by: Arunpravin Paneer Selvam <Arunpravin.PaneerSelvam@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20230909160902.15644-1-Arunpravin.PaneerSelvam@amd.com
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
- add a test to check the range allocation
- export get_buddy() function in drm_buddy.c
- export drm_prandom_u32_max_state() in lib/drm_random.c
- include helper functions
- include prime number header file
v2:
- add drm_get_buddy() function description (Matthew Auld)
- removed unnecessary test succeeded print
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220222174845.2175-3-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
On contiguous allocation, we round up the size
to the *next* power of 2, implement a function
to free the unused pages after the newly allocate block.
v2(Matthew Auld):
- replace function name 'drm_buddy_free_unused_pages' with
drm_buddy_block_trim
- replace input argument name 'actual_size' with 'new_size'
- add more validation checks for input arguments
- add overlaps check to avoid needless searching and splitting
- merged the below patch to see the feature in action
- add free unused pages support to i915 driver
- lock drm_buddy_block_trim() function as it calls mark_free/mark_split
are all globally visible
v3(Matthew Auld):
- remove trim method error handling as we address the failure case
at drm_buddy_block_trim() function
v4:
- in case of trim, at __alloc_range() split_block failure path
marks the block as free and removes it from the original list,
potentially also freeing it, to overcome this problem, we turn
the drm_buddy_block_trim() input node into a temporary node to
prevent recursively freeing itself, but still retain the
un-splitting/freeing of the other nodes(Matthew Auld)
- modify the drm_buddy_block_trim() function return type
v5(Matthew Auld):
- revert drm_buddy_block_trim() function return type changes in v4
- modify drm_buddy_block_trim() passing argument n_pages to original_size
as n_pages has already been rounded up to the next power-of-two and
passing n_pages results noop
v6:
- fix warnings reported by kernel test robot <lkp@intel.com>
v7:
- modify drm_buddy_block_trim() function doc description
- at drm_buddy_block_trim() handle non-allocated block as
a serious programmer error
- fix a typo
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-3-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
Implemented a function which walk through the order list,
compares the offset and returns the maximum offset block,
this method is unpredictable in obtaining the high range
address blocks which depends on allocation and deallocation.
for instance, if driver requests address at a low specific
range, allocator traverses from the root block and splits
the larger blocks until it reaches the specific block and
in the process of splitting, lower orders in the freelist
are occupied with low range address blocks and for the
subsequent TOPDOWN memory request we may return the low
range blocks.To overcome this issue, we may go with the
below approach.
The other approach, sorting each order list entries in
ascending order and compares the last entry of each
order list in the freelist and return the max block.
This creates sorting overhead on every drm_buddy_free()
request and split up of larger blocks for a single page
request.
v2:
- Fix alignment issues(Matthew Auld)
- Remove unnecessary list_empty check(Matthew Auld)
- merged the below patch to see the feature in action
- add top-down alloc support to i915 driver
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-2-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>
- Make drm_buddy_alloc a single function to handle
range allocation and non-range allocation demands
- Implemented a new function alloc_range() which allocates
the requested power-of-two block comply with range limitations
- Moved order computation and memory alignment logic from
i915 driver to drm buddy
v2:
merged below changes to keep the build unbroken
- drm_buddy_alloc_range() becomes obsolete and may be removed
- enable ttm range allocation (fpfn / lpfn) support in i915 driver
- apply enhanced drm_buddy_alloc() function to i915 driver
v3(Matthew Auld):
- Fix alignment issues and remove unnecessary list_empty check
- add more validation checks for input arguments
- make alloc_range() block allocations as bottom-up
- optimize order computation logic
- replace uint64_t with u64, which is preferred in the kernel
v4(Matthew Auld):
- keep drm_buddy_alloc_range() function implementation for generic
actual range allocations
- keep alloc_range() implementation for end bias allocations
v5(Matthew Auld):
- modify drm_buddy_alloc() passing argument place->lpfn to lpfn
as place->lpfn will currently always be zero for i915
v6(Matthew Auld):
- fixup potential uaf - If we are unlucky and can't allocate
enough memory when splitting blocks, where we temporarily
end up with the given block and its buddy on the respective
free list, then we need to ensure we delete both blocks,
and no just the buddy, before potentially freeing them
- fix warnings reported by kernel test robot <lkp@intel.com>
v7(Matthew Auld):
- revert fixup potential uaf
- keep __alloc_range() add node to the list logic same as
drm_buddy_alloc_blocks() by having a temporary list variable
- at drm_buddy_alloc_blocks() keep i915 range_overflows macro
and add a new check for end variable
v8:
- fix warnings reported by kernel test robot <lkp@intel.com>
v9(Matthew Auld):
- remove DRM_BUDDY_RANGE_ALLOCATION flag
- remove unnecessary function description
v10:
- keep DRM_BUDDY_RANGE_ALLOCATION flag as removing the flag
and replacing with (end < size) logic fails amdgpu driver load
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Reviewed-by: Matthew Auld <matthew.auld@intel.com>
Signed-off-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220221164552.2434-1-Arunpravin.PaneerSelvam@amd.com
Move the base i915 buddy allocator code into drm
- Move i915_buddy.h to include/drm
- Move i915_buddy.c to drm root folder
- Rename "i915" string with "drm" string wherever applicable
- Rename "I915" string with "DRM" string wherever applicable
- Fix header file dependencies
- Fix alignment issues
- add Makefile support for drm buddy
- export functions and write kerneldoc description
- Remove i915 selftest config check condition as buddy selftest
will be moved to drm selftest folder
cleanup i915 buddy references in i915 driver module
and replace with drm buddy
v2:
- include header file in alphabetical order(Thomas)
- merged changes listed in the body section into a single patch
to keep the build intact(Christian, Jani)
v3:
- make drm buddy a separate module(Thomas, Christian)
v4:
- Fix build error reported by kernel test robot <lkp@intel.com>
- removed i915 buddy selftest from i915_mock_selftests.h to
avoid build error
- removed selftests/i915_buddy.c file as we create a new set of
buddy test cases in drm/selftests folder
v5:
- Fix merge conflict issue
v6:
- replace drm_buddy_mm structure name as drm_buddy(Thomas, Christian)
- replace drm_buddy_alloc() function name as drm_buddy_alloc_blocks()
(Thomas)
- replace drm_buddy_free() function name as drm_buddy_free_block()
(Thomas)
- export drm_buddy_free_block() function
- fix multiple instances of KMEM_CACHE() entry
v7:
- fix warnings reported by kernel test robot <lkp@intel.com>
- modify the license(Christian)
v8:
- fix warnings reported by kernel test robot <lkp@intel.com>
Signed-off-by: Arunpravin <Arunpravin.PaneerSelvam@amd.com>
Acked-by: Christian König <christian.koenig@amd.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20220118104504.2349-1-Arunpravin.PaneerSelvam@amd.com
Signed-off-by: Christian König <christian.koenig@amd.com>