binder: use bitmap for faster descriptor lookup

When creating new binder references, the driver assigns a descriptor id
that is shared with userspace. Regrettably, the driver needs to keep the
descriptors small enough to accommodate userspace potentially using them
as Vector indexes. Currently, the driver performs a linear search on the
rb-tree of references to find the smallest available descriptor id. This
approach, however, scales poorly as the number of references grows.

This patch introduces the usage of bitmaps to boost the performance of
descriptor assignments. This optimization results in notable performance
gains, particularly in processes with a large number of references. The
following benchmark with 100,000 references showcases the difference in
latency between the dbitmap implementation and the legacy approach:

  [ 587.145098] get_ref_desc_olocked: 15us (dbitmap on)
  [ 602.788623] get_ref_desc_olocked: 47343us (dbitmap off)

Note the bitmap size is dynamically adjusted in line with the number of
references, ensuring efficient memory usage. In cases where growing the
bitmap is not possible, the driver falls back to the slow legacy method.

A previous attempt to solve this issue was proposed in [1]. However,
that method involved adding new ioctls, which isn't great, and older
userspace code would not have benefited from the optimizations either.

Link: https://lore.kernel.org/all/20240417191418.1341988-1-cmllamas@google.com/ [1]
Cc: Tim Murray <timmurray@google.com>
Cc: Arve Hjønnevåg <arve@android.com>
Cc: Alice Ryhl <aliceryhl@google.com>
Cc: Martijn Coenen <maco@android.com>
Cc: Todd Kjos <tkjos@android.com>
Cc: John Stultz <jstultz@google.com>
Cc: Steven Moreland <smoreland@google.com>
Suggested-by: Nick Chen <chenjia3@oppo.com>
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
Link: https://lore.kernel.org/r/20240612042535.1556708-1-cmllamas@google.com
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2024-06-12 04:25:13 +00:00
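
The descriptor lookup described in the commit message can be illustrated with a small userspace sketch. It is not the driver code: `first_zero_bit()`, `acquire_desc()` and `release_desc()` are hypothetical names, and the scan tests one bit at a time for clarity, whereas the kernel's `find_next_zero_bit()` works on whole words. The point is only that the smallest free ID falls out of a scan over a compact bit array instead of a walk over every reference.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical userspace stand-in for find_next_zero_bit(): returns the
 * index of the first clear bit at or after @offset, or @nbits if none.
 */
static size_t first_zero_bit(const unsigned long *map, size_t nbits,
			     size_t offset)
{
	for (size_t i = offset; i < nbits; i++) {
		if (!(map[i / WORD_BITS] & (1UL << (i % WORD_BITS))))
			return i;
	}
	return nbits;
}

/*
 * Claim the smallest free descriptor ID at or after @offset.
 * Returns 0 and stores the ID in @desc, or -1 when every bit is taken
 * (the point at which a caller would need a larger bitmap).
 */
static int acquire_desc(unsigned long *map, size_t nbits, size_t offset,
			size_t *desc)
{
	size_t n;

	assert(map && desc);
	n = first_zero_bit(map, nbits, offset);
	if (n == nbits)
		return -1;

	map[n / WORD_BITS] |= 1UL << (n % WORD_BITS);
	*desc = n;
	return 0;
}

/* Release a descriptor ID so that it can be handed out again. */
static void release_desc(unsigned long *map, size_t desc)
{
	map[desc / WORD_BITS] &= ~(1UL << (desc % WORD_BITS));
}
```

Releasing an ID and acquiring again returns that same ID if it is the lowest free one, which is the property that keeps descriptors small.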

/* SPDX-License-Identifier: GPL-2.0-only */
/*
 * Copyright 2024 Google LLC
 *
 * dbitmap - dynamically sized bitmap library.
 *
 * Used by the binder driver to optimize the allocation of the smallest
 * available descriptor ID. Each bit in the bitmap represents the state
 * of an ID.
 *
 * A dbitmap can grow or shrink as needed. This part has been designed
 * considering that users might need to briefly release their locks in
 * order to allocate memory for the new bitmap. These operations are
 * then verified to determine if the grow or shrink is still valid.
 *
 * This library does not provide protection against concurrent access
 * by itself. Binder uses the proc->outer_lock for this purpose.
 */

#ifndef _LINUX_DBITMAP_H
#define _LINUX_DBITMAP_H
#include <linux/bitmap.h>

#define NBITS_MIN	BITS_PER_TYPE(unsigned long)

struct dbitmap {
	unsigned int nbits;
	unsigned long *map;
};

static inline int dbitmap_enabled(struct dbitmap *dmap)
{
	return !!dmap->nbits;
}

static inline void dbitmap_free(struct dbitmap *dmap)
{
	dmap->nbits = 0;
	kfree(dmap->map);
}

/* Returns the nbits that a dbitmap can shrink to, 0 if not possible. */
static inline unsigned int dbitmap_shrink_nbits(struct dbitmap *dmap)
{
	unsigned int bit;

	if (dmap->nbits <= NBITS_MIN)
		return 0;

	/*
	 * Determine if the bitmap can shrink based on the position of
	 * its last set bit. If the bit is within the first quarter of
	 * the bitmap then shrinking is possible. In this case, the
	 * bitmap should shrink to half its current size.
	 */
	bit = find_last_bit(dmap->map, dmap->nbits);
	if (bit < (dmap->nbits >> 2))
		return dmap->nbits >> 1;

	/* find_last_bit() returns dmap->nbits when no bits are set. */
	if (bit == dmap->nbits)
		return NBITS_MIN;

	return 0;
}

/* Replace the internal bitmap with a new one of different size */
static inline void
dbitmap_replace(struct dbitmap *dmap, unsigned long *new, unsigned int nbits)
{
	bitmap_copy(new, dmap->map, min(dmap->nbits, nbits));
	kfree(dmap->map);
	dmap->map = new;
	dmap->nbits = nbits;
}

static inline void
dbitmap_shrink(struct dbitmap *dmap, unsigned long *new, unsigned int nbits)
{
	if (!new)
		return;

	/*
	 * Verify that shrinking to @nbits is still possible. The @new
	 * bitmap might have been allocated without locks, so this call
	 * could now be outdated. In this case, free @new and move on.
	 */
	if (!dbitmap_enabled(dmap) || dbitmap_shrink_nbits(dmap) != nbits) {
		kfree(new);
		return;
	}

	dbitmap_replace(dmap, new, nbits);
}

/* Returns the nbits that a dbitmap can grow to. */
static inline unsigned int dbitmap_grow_nbits(struct dbitmap *dmap)
{
	return dmap->nbits << 1;
}

static inline void
dbitmap_grow(struct dbitmap *dmap, unsigned long *new, unsigned int nbits)
{
	/*
	 * Verify that growing to @nbits is still possible. The @new
	 * bitmap might have been allocated without locks, so this call
	 * could now be outdated. In this case, free @new and move on.
	 */
	if (!dbitmap_enabled(dmap) || nbits <= dmap->nbits) {
		kfree(new);
		return;
	}

	/*
	 * Check for ENOMEM after confirming the grow operation is still
	 * required. This ensures we only disable the dbitmap when it's
	 * necessary. Once the dbitmap is disabled, binder will fall back
	 * to slow_desc_lookup_olocked().
	 */
	if (!new) {
		dbitmap_free(dmap);
		return;
	}

	dbitmap_replace(dmap, new, nbits);
}

/*
 * Finds and sets the next zero bit in the bitmap. Upon success @bit
 * is populated with the index and 0 is returned. Otherwise, -ENOSPC
 * is returned to indicate that a dbitmap_grow() is needed.
 */
static inline int
dbitmap_acquire_next_zero_bit(struct dbitmap *dmap, unsigned long offset,
			      unsigned long *bit)
{
	unsigned long n;

	n = find_next_zero_bit(dmap->map, dmap->nbits, offset);
	if (n == dmap->nbits)
		return -ENOSPC;

	*bit = n;
	set_bit(n, dmap->map);

	return 0;
}

static inline void
dbitmap_clear_bit(struct dbitmap *dmap, unsigned long bit)
{
	clear_bit(bit, dmap->map);
}

static inline int dbitmap_init(struct dbitmap *dmap)
{
	dmap->map = bitmap_zalloc(NBITS_MIN, GFP_KERNEL);
	if (!dmap->map) {
		dmap->nbits = 0;
		return -ENOMEM;
	}

	dmap->nbits = NBITS_MIN;

	return 0;
}
#endif
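
The grow path above is written for callers that drop their lock, allocate, retake the lock and only then call `dbitmap_grow()`, so the request may be stale by the time it arrives. The following userspace sketch mirrors that revalidation under stated assumptions: `struct umap`, `umap_alloc()` and `umap_grow()` are hypothetical stand-ins, `calloc()`/`free()` replace the kernel allocators, and locking is left out entirely. Unlike `dbitmap_free()`, the sketch also resets the pointer to NULL when it disables the map.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical userspace mirror of struct dbitmap. */
struct umap {
	unsigned int nbits;	/* 0 means the bitmap is disabled */
	unsigned long *map;
};

/* Allocate a zeroed bitmap; @nbits is assumed a multiple of WORD_BITS. */
static unsigned long *umap_alloc(unsigned int nbits)
{
	return calloc(nbits / WORD_BITS, sizeof(unsigned long));
}

/*
 * Install @new (sized for @nbits) if growing is still needed. Takes
 * ownership of @new in every case, as dbitmap_grow() does.
 */
static void umap_grow(struct umap *m, unsigned long *new, unsigned int nbits)
{
	assert(m);

	/* Stale request: the map was already grown or disabled. */
	if (!m->nbits || nbits <= m->nbits) {
		free(new);
		return;
	}

	/* Allocation failed and growth is still required: disable. */
	if (!new) {
		free(m->map);
		m->map = NULL;
		m->nbits = 0;
		return;
	}

	/* Copy the old bits over, then swap the buffers. */
	memcpy(new, m->map, m->nbits / CHAR_BIT);
	free(m->map);
	m->map = new;
	m->nbits = nbits;
}
```

A caller would compute the target size while holding its lock (the equivalent of `dbitmap_grow_nbits()`), release the lock, call `umap_alloc()`, retake the lock, and pass both values to `umap_grow()`. Checking for a stale request before checking for a failed allocation means a late allocation failure does not disable a map that no longer needs to grow.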