// SPDX-License-Identifier: BSD-3-Clause OR GPL-2.0
/* Copyright 2016-2018 NXP
 * Copyright (c) 2018-2019, Vladimir Oltean <olteanv@gmail.com>
 */
#include <linux/packing.h>
#include <linux/module.h>
#include <linux/bitops.h>
lib: packing: add pack_fields() and unpack_fields()
This is new API which caters to the following requirements:
- Pack or unpack a large number of fields to/from a buffer with a small
code footprint. The current alternative is to open-code a large number
of calls to pack() and unpack(), or to use packing() to reduce that
number to half. But packing() is not const-correct.
- Use unpacked numbers stored in variables smaller than u64. This
reduces the rodata footprint of the stored field arrays.
- Perform error checking at compile time, rather than runtime, and return
void from the API functions. Because the C preprocessor can't generate
variable length code (loops), this is a bit tricky to do with macros.
To handle this, implement macros which sanity check the packed field
definitions based on their size. Finally, a single macro with a chain of
__builtin_choose_expr() is used to select the appropriate macros. We
enforce the use of ascending or descending order to avoid O(N^2) scaling
when checking for overlap. Note that the macros are written with care to
ensure that the compilers can correctly evaluate the resulting code at
compile time. In particular, care was taken with avoiding too many nested
statement expressions. Nested statement expressions trip up some
compilers, especially when passing down variables created in previous
statement expressions.
There are two key design choices intended to keep the overall macro code
size small. First, the definition of each CHECK_PACKED_FIELDS_N macro is
implemented recursively, by calling the N-1 macro. This avoids needing
the code to repeat multiple times.
Second, the CHECK_PACKED_FIELD macro enforces that the fields in the
array are sorted in order. This allows checking for overlap only with
neighboring fields, rather than the general overlap case where each field
would need to be checked against other fields.
The overlap checks use the first two fields to determine the order of the
remaining fields, thus allowing either ascending or descending order.
This gives drivers the flexibility to keep the fields in whichever order
most naturally fits their hardware design and its associated
documentation.
The CHECK_PACKED_FIELDS macro is called directly from within pack_fields()
and unpack_fields(), ensuring that all drivers using the API receive the
benefits of the compile-time checks. Users do not need to call any of the
macros directly.
The CHECK_PACKED_FIELDS macro and its CHECK_PACKED_FIELDS_(0..50) helper
macros are generated using a simple C program in
scripts/gen_packed_field_checks.c.
This program can be compiled on demand and executed to generate the
macro code in include/linux/packing.h. This will aid in the event that a
driver needs more than 50 fields. The generator can be updated with a new
size, and used to update the packing.h header file. In practice, the ice
driver will need to support 27 fields, and the sja1105 driver will need
to support 40 fields. This on-demand generation avoids the need to modify
Kbuild. We do not anticipate the maximum number of fields to grow very
often.
- Reduced rodata footprint for the storage of the packed field arrays.
To that end, we have struct packed_field_u8 and packed_field_u16, which
define the fields with the associated type. More can be added as
needed (unlikely for now). On these types, the same generic pack_fields()
and unpack_fields() API can be used, thanks to the new C11 _Generic()
selection feature, which can call pack_fields_u8() or pack_fields_u16(),
depending on the type of the "fields" array - a simplistic form of
polymorphism. Which function is actually called is resolved at compile
time.
Over time, packing() is expected to be completely replaced either with
pack() or with pack_fields().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241210-packing-pack-fields-and-ice-implementation-v10-3-ee56a47479ac@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-10 12:27:12 -08:00
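The type-based dispatch described in the commit message above can be illustrated with a small userspace C11 sketch. Everything prefixed with demo_ is hypothetical and exists only for this illustration; the kernel's struct packed_field_u8/u16 definitions and pack_fields() macro differ in detail. The sketch only shows that _Generic() selects a helper at compile time from the element type of the "fields" array, and that the narrower descriptor type takes less storage.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptors: same members, different storage width. */
struct demo_field_u8 { uint8_t startbit, endbit, offset, size; };
struct demo_field_u16 { uint16_t startbit, endbit, offset, size; };

/* Each helper just reports how many bytes its descriptor array occupies. */
static size_t demo_pack_fields_u8(const struct demo_field_u8 *fields, size_t n)
{
	(void)fields;
	return n * sizeof(struct demo_field_u8);
}

static size_t demo_pack_fields_u16(const struct demo_field_u16 *fields, size_t n)
{
	(void)fields;
	return n * sizeof(struct demo_field_u16);
}

/* The helper is chosen at compile time from the array's element type. */
#define demo_pack_fields(fields, n)					\
	_Generic(&(fields)[0],						\
		 const struct demo_field_u8 *: demo_pack_fields_u8,	\
		 const struct demo_field_u16 *: demo_pack_fields_u16)	\
		(fields, n)

/* Made-up field tables, kept const so they would live in rodata. */
static const struct demo_field_u8 demo_small[] = {
	{ 7, 4, 0, 1 }, { 3, 0, 1, 1 },
};
static const struct demo_field_u16 demo_large[] = {
	{ 300, 290, 0, 2 }, { 289, 280, 2, 2 },
};

static void demo_generic_example(void)
{
	/* Same call syntax, different helper selected per array type. */
	assert(demo_pack_fields(demo_small, 2) == sizeof(demo_small));
	assert(demo_pack_fields(demo_large, 2) == sizeof(demo_large));
	/* The u8 table is the smaller of the two. */
	assert(sizeof(demo_small) < sizeof(demo_large));
}
```

Calling demo_generic_example() exercises both branches; passing an array of any other element type to demo_pack_fields() fails to compile, because _Generic() has no matching association.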
#include <linux/bits.h>
#include <linux/errno.h>
#include <linux/types.h>
#include <linux/bitrev.h>

/* Pack each entry of @fields from @ustruct into @pbuf. */
#define __pack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks) \
	({ \
		for (size_t i = 0; i < (num_fields); i++) { \
			typeof(&(fields)[0]) field = &(fields)[i]; \
			u64 uval; \
 \
			uval = ustruct_field_to_u64(ustruct, field->offset, field->size); \
 \
			__pack(pbuf, uval, field->startbit, field->endbit, \
			       pbuflen, quirks); \
		} \
	})

/* Unpack each entry of @fields from @pbuf into @ustruct. */
#define __unpack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks) \
	({ \
		for (size_t i = 0; i < (num_fields); i++) { \
			typeof(&(fields)[0]) field = &(fields)[i]; \
			u64 uval; \
 \
			__unpack(pbuf, &uval, field->startbit, field->endbit, \
				 pbuflen, quirks); \
 \
			u64_to_ustruct_field(ustruct, field->offset, field->size, uval); \
		} \
	})

lib: packing: adjust definitions and implementation for arbitrary buffer lengths
Jacob Keller has a use case for packing() in the intel/ice networking
driver, but it cannot be used as-is.
Simply put, the API quirks for LSW32_IS_FIRST and LITTLE_ENDIAN are
naively implemented with the undocumented assumption that the buffer
length must be a multiple of 4. All calculations of group offsets and
offsets of bytes within groups assume that this is the case. But in the
ice case, this does not hold true. For example, packing into a buffer
of 22 bytes would yield wrong results, but pretending it was a 24 byte
buffer would work.
Rather than requiring such hacks, and leaving a big question mark when
it comes to discontinuities in the accessible bit fields of such a buffer,
we should extend the packing API to support this use case.
It turns out that we can keep the design in terms of groups of 4 bytes,
but also make it work if the total length is not a multiple of 4.
Just like before, imagine the buffer as a big number, and its most
significant bytes (the ones that would make up to a multiple of 4) are
missing. Thus, with a big endian (no quirks) interpretation of the
buffer, those most significant bytes would be absent from the beginning
of the buffer, and with a LSW32_IS_FIRST interpretation, they would be
absent from the end of the buffer. The LITTLE_ENDIAN quirk, in the
packing() API world, only affects byte ordering within groups of 4.
Thus, it does not change which bytes are missing, only the significance
of the remaining bytes within the (smaller) group.
No change intended for buffer sizes which are multiples of 4. Tested
with the sja1105 driver and with downstream unit tests.
Link: https://lore.kernel.org/netdev/a0338310-e66c-497c-bc1f-a597e50aa3ff@intel.com/
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Tested-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-2-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-02 14:51:51 -07:00
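To make the 22-byte case from the commit message above concrete, here is a userspace sketch of the group arithmetic. demo_box_addr() is a hypothetical helper written for this illustration, not the kernel function: it covers only the big endian (no quirks) and LSW32_IS_FIRST interpretations, leaves out QUIRK_LITTLE_ENDIAN, and assumes box < len.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Map a logical byte index ("box", 0 = least significant byte) to a
 * physical offset in a buffer of len bytes, treated as groups of 4 bytes.
 * Assumes box < len. Bytes inside a group are big endian.
 */
static size_t demo_box_addr(size_t box, size_t len, bool lsw32_is_first)
{
	size_t group = box / 4;		/* which group of 4 bytes */
	size_t in_group = box % 4;	/* significance inside the group */
	size_t group_off, group_size;

	if (lsw32_is_first) {
		/* Least significant group first; a short group is at the end. */
		group_off = group * 4;
		group_size = (len - group_off < 4) ? len - group_off : 4;
	} else if (len >= (group + 1) * 4) {
		/* Full group, counted back from the end of the buffer. */
		group_off = len - (group + 1) * 4;
		group_size = 4;
	} else {
		/* Short most significant group sits at the buffer start. */
		group_off = 0;
		group_size = len % 4;
	}

	/* Most significant byte of the group comes first. */
	return group_off + group_size - in_group - 1;
}

static void demo_box_addr_example(void)
{
	/* 22-byte buffer, no quirks: the 2 missing bytes are at the start. */
	assert(demo_box_addr(0, 22, false) == 21);
	assert(demo_box_addr(21, 22, false) == 0);

	/* 22-byte buffer, LSW32_IS_FIRST: the short group is at the end. */
	assert(demo_box_addr(0, 22, true) == 3);
	assert(demo_box_addr(20, 22, true) == 21);
	assert(demo_box_addr(21, 22, true) == 20);
}
```

With a length that is a multiple of 4, the short-group branches are never taken, which matches the "no change intended" statement for such buffers.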
/**
 * calculate_box_addr - Determine physical location of byte in buffer
 * @box: Index of byte within buffer seen as a logical big-endian big number
 * @len: Size of buffer in bytes
 * @quirks: mask of QUIRK_LSW32_IS_FIRST and QUIRK_LITTLE_ENDIAN
 *
 * Function interprets the buffer as a @len byte sized big number, and returns
 * the physical offset of the @box logical octet within it. Internally, it
 * treats the big number as groups of 4 bytes. If @len is not a multiple of 4,
 * the last group may be shorter.
 *
 * @QUIRK_LSW32_IS_FIRST gives the ordering of groups of 4 octets relative to
 * each other. If set, the most significant group of 4 octets is last in the
 * buffer (and may be truncated if @len is not a multiple of 4).
 *
 * @QUIRK_LITTLE_ENDIAN gives the ordering of bytes within each group of 4.
 * If set, the most significant byte is last in the group. If @len takes the
 * form of 4k+3, the last group will only be able to represent 24 bits, and its
 * most significant octet is byte 2.
 *
 * Return: the physical offset into the buffer corresponding to the logical box.
 */
static size_t calculate_box_addr(size_t box, size_t len, u8 quirks)
{
	size_t offset_of_group, offset_in_group, this_group = box / 4;
	size_t group_size;

	if (quirks & QUIRK_LSW32_IS_FIRST)
		offset_of_group = this_group * 4;
	else
		offset_of_group = len - ((this_group + 1) * 4);

	group_size = min(4, len - offset_of_group);

	if (quirks & QUIRK_LITTLE_ENDIAN)
		offset_in_group = box - this_group * 4;
	else
		offset_in_group = group_size - (box - this_group * 4) - 1;

	return offset_of_group + offset_in_group;
}

static void __pack(void *pbuf, u64 uval, size_t startbit, size_t endbit,
		   size_t pbuflen, u8 quirks)
{
	/* Logical byte indices corresponding to the
	 * start and end of the field.
	 */
	int plogical_first_u8 = startbit / BITS_PER_BYTE;
	int plogical_last_u8 = endbit / BITS_PER_BYTE;
	int value_width = startbit - endbit + 1;
	int box;

	/* Check if "uval" fits in "value_width" bits.
	 * The test only works for value_width < 64, but in the latter case,
	 * any 64-bit uval will surely fit.
	 */
	WARN(value_width < 64 && uval >= (1ull << value_width),
	     "Cannot store 0x%llx inside bits %zu-%zu - will truncate\n",
	     uval, startbit, endbit);

	/* Iterate through an idealistic view of the pbuf as an u64 with
	 * no quirks, u8 by u8 (aligned at u8 boundaries), from high to low
	 * logical bit significance. "box" denotes the current logical u8.
	 */
	for (box = plogical_first_u8; box >= plogical_last_u8; box--) {
		/* Bit indices into the currently accessed 8-bit box */
		size_t box_start_bit, box_end_bit, box_addr;
		u8 box_mask;
		/* Corresponding bits from the unpacked u64 parameter */
		size_t proj_start_bit, proj_end_bit;
		u64 proj_mask;
		u64 pval;

		/* This u8 may need to be accessed in its entirety
		 * (from bit 7 to bit 0), or not, depending on the
		 * input arguments startbit and endbit.
		 */
		if (box == plogical_first_u8)
			box_start_bit = startbit % BITS_PER_BYTE;
		else
			box_start_bit = 7;
		if (box == plogical_last_u8)
			box_end_bit = endbit % BITS_PER_BYTE;
		else
			box_end_bit = 0;

		/* We have determined the box bit start and end.
		 * Now we calculate where this (masked) u8 box would fit
		 * in the unpacked (CPU-readable) u64 - the u8 box's
		 * projection onto the unpacked u64. Though the
		 * box is u8, the projection is u64 because it may fall
		 * anywhere within the unpacked u64.
		 */
		proj_start_bit = ((box * BITS_PER_BYTE) + box_start_bit) - endbit;
		proj_end_bit = ((box * BITS_PER_BYTE) + box_end_bit) - endbit;
		proj_mask = GENMASK_ULL(proj_start_bit, proj_end_bit);
		box_mask = GENMASK(box_start_bit, box_end_bit);

		/* Determine the offset of the u8 box inside the pbuf,
		 * adjusted for quirks. The adjusted box_addr will be used for
		 * effective addressing inside the pbuf (so it's not
		 * logical any longer).
		 */
		box_addr = calculate_box_addr(box, pbuflen, quirks);

		/* Write to pbuf, read from uval */
		pval = uval & proj_mask;
		pval >>= proj_end_bit;
		pval <<= box_end_bit;
lib: packing: fix QUIRK_MSB_ON_THE_RIGHT behavior
The QUIRK_MSB_ON_THE_RIGHT quirk is intended to modify pack() and unpack()
so that the most significant bit of each byte in the packed layout is on
the right.
The way the quirk is currently implemented is broken whenever the packing
code packs or unpacks any value that is not exactly a full byte.
The broken behavior can occur when packing any values smaller than one
byte, when packing any value that is not exactly a whole number of bytes,
or when the packing is not aligned to a byte boundary.
This quirk is documented in the following way:
1. Normally (no quirks), we would do it like this:
::
  63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
  7                       6                       5                       4
  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
  3                       2                       1                       0
<snip>
2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:
::
  56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
  7                       6                       5                       4
  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
  3                       2                       1                       0
That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
inverts bit offsets inside a byte.
Essentially, the mapping for physical bit offsets should be reversed for a
given byte within the payload. This reversal should be fixed to the bytes
in the packing layout.
The logic to implement this quirk is handled within the
adjust_for_msb_right_quirk() function. This function does not work properly
when dealing with the bytes that contain only a partial amount of data.
In particular, consider trying to pack or unpack the range 53-44. We should
always be mapping the bits from the logical ordering to their physical
ordering in the same way, regardless of what sequence of bits we are
unpacking.
Thus, we should grab the following logical bits:
  Logical:   55 54 53 52 51 50 49 48    47 46 45 44 43 42 41 40
                    ^  ^  ^  ^  ^  ^     ^  ^  ^  ^
And pack them into the physical bits:
  Physical:  48 49 50 51 52 53 54 55    40 41 42 43 44 45 46 47
  Logical:   48 49 50 51 52 53                      44 45 46 47
              ^  ^  ^  ^  ^  ^                       ^  ^  ^  ^
The current logic in adjust_for_msb_right_quirk is broken. I believe it is
intending to map according to the following:
  Physical:  48 49 50 51 52 53 54 55    40 41 42 43 44 45 46 47
  Logical:         48 49 50 51 52 53    44 45 46 47
                    ^  ^  ^  ^  ^  ^     ^  ^  ^  ^
That is, it tries to keep the bits at the start and end of a packing
together. This is wrong, as it makes the packing change what bit is being
mapped to what based on which bits you're currently packing or unpacking.
Worse, the actual calculations within adjust_for_msb_right_quirk don't make
sense.
Consider the case when packing the last byte of an unaligned packing. It
might have a start bit of 7 and an end bit of 5. This would have a width of
3 bits. The new_start_bit will be calculated as the width - the box_end_bit
- 1. This will underflow and produce a negative value, which will
ultimately result in generating a new box_mask of all 0s.
For any other values, the result of the calculations of the
new_box_end_bit, new_box_start_bit, and the new box_mask will result in the
exact same values for the box_end_bit, box_start_bit, and box_mask. This
makes the calculations completely irrelevant.
If box_end_bit is 0, and box_start_bit is 7, then the entire function of
adjust_for_msb_right_quirk will boil down to just:
*to_write = bitrev8(*to_write)
The other adjustments are attempting (incorrectly) to keep the bits in the
same place but just reversed. This is not the right behavior even if
implemented correctly, as it leaves the mapping dependent on the bit values
being packed or unpacked.
Remove adjust_for_msb_right_quirk() and just use bitrev8 to reverse the
bit order of each byte when interacting with the packed data.
In particular, for packing, we need to reverse both the box_mask and the
physical value being packed. This is done after shifting the value by
box_end_bit so that the reversed mapping is always aligned to the physical
buffer byte boundary. The box_mask is reversed as we're about to use it to
clear any stale bits in the physical buffer at this block.
For unpacking, we need to reverse the contents of the physical buffer
*before* masking with the box_mask. This is critical, as the box_mask is a
logical mask of the bit layout before handling the QUIRK_MSB_ON_THE_RIGHT.
Add several new tests which cover this behavior. These tests will fail
without the fix and pass afterwards. Note that no current drivers make use
of QUIRK_MSB_ON_THE_RIGHT. I suspect this is why there have been no reports
of this inconsistency before.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-8-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-02 14:51:57 -07:00
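The fixed behavior described above (reverse both the value and the mask for each physical byte) can be sketched in userspace for the 53-44 example. demo_bitrev8() is a stand-in for the kernel's bitrev8(), and demo_store_msb_right() is a hypothetical helper that handles a single byte; neither is the kernel implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Reverse the bit order of one byte (stand-in for the kernel's bitrev8()). */
static uint8_t demo_bitrev8(uint8_t b)
{
	b = (uint8_t)((b & 0xF0) >> 4 | (b & 0x0F) << 4);
	b = (uint8_t)((b & 0xCC) >> 2 | (b & 0x33) << 2);
	b = (uint8_t)((b & 0xAA) >> 1 | (b & 0x55) << 1);
	return b;
}

/*
 * Store the bits of val selected by logical_mask into *byte, with the
 * most significant bit on the right. Value and mask are reversed the same
 * way, so the logical-to-physical mapping never depends on which bits are
 * being written.
 */
static void demo_store_msb_right(uint8_t *byte, uint8_t val,
				 uint8_t logical_mask)
{
	uint8_t mask = demo_bitrev8(logical_mask);
	uint8_t bits = demo_bitrev8(val & logical_mask);

	/* Clear stale bits under the reversed mask, then set the new ones. */
	*byte = (uint8_t)((*byte & ~mask) | bits);
}

static void demo_msb_right_example(void)
{
	uint8_t byte6 = 0, byte5 = 0;

	/* Field 53..44, all ones: bits 5..0 of byte 6, bits 7..4 of byte 5. */
	demo_store_msb_right(&byte6, 0x3F, 0x3F);
	demo_store_msb_right(&byte5, 0xF0, 0xF0);

	assert(byte6 == 0xFC);	/* left six physical positions */
	assert(byte5 == 0x0F);	/* right four physical positions */
}
```

The two results line up with the "Physical/Logical" diagram for the intended mapping: logical bits 48-53 occupy the left six positions of their byte, and logical bits 44-47 occupy the right four positions of the next one.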

		if (quirks & QUIRK_MSB_ON_THE_RIGHT) {
			pval = bitrev8(pval);
			box_mask = bitrev8(box_mask);
		}

		((u8 *)pbuf)[box_addr] &= ~box_mask;
		((u8 *)pbuf)[box_addr] |= pval;
	}
}

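The box and projection arithmetic used by __pack() can be traced with a small userspace sketch. demo_project() and struct demo_proj are hypothetical names introduced here; the helper only repeats the index calculations for one logical byte and does not touch a buffer or apply any quirks.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BITS_PER_BYTE 8

/* Bit ranges for one logical byte ("box") of a field. */
struct demo_proj {
	size_t box_start_bit, box_end_bit;	/* inside the packed byte */
	size_t proj_start_bit, proj_end_bit;	/* inside the unpacked value */
};

static struct demo_proj demo_project(size_t box, size_t startbit,
				     size_t endbit)
{
	struct demo_proj p;

	/* Only the first and last boxes of a field can be partial. */
	p.box_start_bit = (box == startbit / DEMO_BITS_PER_BYTE) ?
			  startbit % DEMO_BITS_PER_BYTE : 7;
	p.box_end_bit = (box == endbit / DEMO_BITS_PER_BYTE) ?
			endbit % DEMO_BITS_PER_BYTE : 0;

	/* Rebase the logical bit indices so that endbit maps to bit 0. */
	p.proj_start_bit = box * DEMO_BITS_PER_BYTE + p.box_start_bit - endbit;
	p.proj_end_bit = box * DEMO_BITS_PER_BYTE + p.box_end_bit - endbit;

	return p;
}

/* Field 53..44: box 6 carries value bits 9..4, box 5 carries bits 3..0. */
static void demo_project_example(void)
{
	struct demo_proj hi = demo_project(6, 53, 44);
	struct demo_proj lo = demo_project(5, 53, 44);

	assert(hi.box_start_bit == 5 && hi.box_end_bit == 0);
	assert(hi.proj_start_bit == 9 && hi.proj_end_bit == 4);
	assert(lo.box_start_bit == 7 && lo.box_end_bit == 4);
	assert(lo.proj_start_bit == 3 && lo.proj_end_bit == 0);
}
```

Together the two projections cover value bits 9..0, which is the full 10-bit width of the 53..44 field.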
/**
 * pack - Pack u64 number into bitfield of buffer.
 *
 * @pbuf: Pointer to a buffer holding the packed value.
 * @uval: CPU-readable unpacked value to pack.
 * @startbit: The index (in logical notation, compensated for quirks) where
 *	      the packed value starts within pbuf. Must be larger than, or
 *	      equal to, endbit.
 * @endbit: The index (in logical notation, compensated for quirks) where
 *	    the packed value ends within pbuf. Must be smaller than, or equal
 *	    to, startbit.
 * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
 * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
 *	    QUIRK_MSB_ON_THE_RIGHT.
 *
 * Return: 0 on success, -EINVAL or -ERANGE if called incorrectly. Assuming
 *	   correct usage, return code may be discarded. The @pbuf memory will
 *	   be modified on success.
 */
int pack(void *pbuf, u64 uval, size_t startbit, size_t endbit, size_t pbuflen,
	 u8 quirks)
{
	/* startbit is expected to be larger than or equal to endbit, and
	 * both are expected to be within the logically addressable range of
	 * the buffer.
	 */
	if (unlikely(startbit < endbit || startbit >= BITS_PER_BYTE * pbuflen))
		/* Invalid function call */
		return -EINVAL;

	if (unlikely(startbit - endbit >= 64))
		return -ERANGE;

	__pack(pbuf, uval, startbit, endbit, pbuflen, quirks);

	return 0;
}
EXPORT_SYMBOL(pack);

static void __unpack(const void *pbuf, u64 *uval, size_t startbit, size_t endbit,
		     size_t pbuflen, u8 quirks)
{
	/* Logical byte indices corresponding to the
	 * start and end of the field.
	 */
	int plogical_first_u8 = startbit / BITS_PER_BYTE;
	int plogical_last_u8 = endbit / BITS_PER_BYTE;
	int box;

	/* Initialize parameter */
	*uval = 0;

	/* Iterate through an idealistic view of the pbuf as an u64 with
	 * no quirks, u8 by u8 (aligned at u8 boundaries), from high to low
	 * logical bit significance. "box" denotes the current logical u8.
	 */
	for (box = plogical_first_u8; box >= plogical_last_u8; box--) {
		/* Bit indices into the currently accessed 8-bit box */
		size_t box_start_bit, box_end_bit, box_addr;
		u8 box_mask;
		/* Corresponding bits from the unpacked u64 parameter */
		size_t proj_start_bit, proj_end_bit;
		u64 proj_mask;
		u64 pval;

		/* This u8 may need to be accessed in its entirety
		 * (from bit 7 to bit 0), or not, depending on the
		 * input arguments startbit and endbit.
		 */
		if (box == plogical_first_u8)
			box_start_bit = startbit % BITS_PER_BYTE;
		else
			box_start_bit = 7;
		if (box == plogical_last_u8)
			box_end_bit = endbit % BITS_PER_BYTE;
		else
			box_end_bit = 0;

		/* We have determined the box bit start and end.
		 * Now we calculate where this (masked) u8 box would fit
		 * in the unpacked (CPU-readable) u64 - the u8 box's
		 * projection onto the unpacked u64. Though the
		 * box is u8, the projection is u64 because it may fall
		 * anywhere within the unpacked u64.
		 */
		proj_start_bit = ((box * BITS_PER_BYTE) + box_start_bit) - endbit;
		proj_end_bit = ((box * BITS_PER_BYTE) + box_end_bit) - endbit;
		proj_mask = GENMASK_ULL(proj_start_bit, proj_end_bit);
		box_mask = GENMASK(box_start_bit, box_end_bit);

		/* Determine the offset of the u8 box inside the pbuf,
		 * adjusted for quirks. The adjusted box_addr will be used for
		 * effective addressing inside the pbuf (so it's not
		 * logical any longer).
		 */
		box_addr = calculate_box_addr(box, pbuflen, quirks);

		/* Read from pbuf, write to uval */
lib: packing: fix QUIRK_MSB_ON_THE_RIGHT behavior
The QUIRK_MSB_ON_THE_RIGHT quirk is intended to modify pack() and unpack()
so that the most significant bit of each byte in the packed layout is on
the right.
The way the quirk is currently implemented is broken whenever the packing
code packs or unpacks any value that is not exactly a full byte.
The broken behavior can occur when packing any values smaller than one
byte, when packing any value that is not exactly a whole number of bytes,
or when the packing is not aligned to a byte boundary.
This quirk is documented in the following way:
1. Normally (no quirks), we would do it like this:
::
  63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32
  7                       6                       5                       4
  31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
  3                       2                       1                       0
<snip>
2. If QUIRK_MSB_ON_THE_RIGHT is set, we do it like this:
::
  56 57 58 59 60 61 62 63 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47 32 33 34 35 36 37 38 39
  7                       6                       5                       4
  24 25 26 27 28 29 30 31 16 17 18 19 20 21 22 23  8  9 10 11 12 13 14 15  0  1  2  3  4  5  6  7
  3                       2                       1                       0
That is, QUIRK_MSB_ON_THE_RIGHT does not affect byte positioning, but
inverts bit offsets inside a byte.
Essentially, the mapping for physical bit offsets should be reversed for a
given byte within the payload. This reversal should be fixed to the bytes
in the packing layout.
The logic to implement this quirk is handled within the
adjust_for_msb_right_quirk() function. This function does not work properly
when dealing with the bytes that contain only a partial amount of data.
In particular, consider trying to pack or unpack the range 53-44. We should
always be mapping the bits from the logical ordering to their physical
ordering in the same way, regardless of what sequence of bits we are
unpacking.
Thus, we should grab the following logical bits:
  Logical: 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40
                 ^  ^  ^  ^  ^  ^  ^  ^  ^  ^
And pack them into the physical bits:
  Physical: 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47
  Logical:  48 49 50 51 52 53                   44 45 46 47
            ^  ^  ^  ^  ^  ^                    ^  ^  ^  ^
The current logic in adjust_for_msb_right_quirk is broken. I believe it is
intending to map according to the following:
  Physical: 48 49 50 51 52 53 54 55 40 41 42 43 44 45 46 47
  Logical:        48 49 50 51 52 53 44 45 46 47
                  ^  ^  ^  ^  ^  ^  ^  ^  ^  ^
That is, it tries to keep the bits at the start and end of a packing
together. This is wrong, as it makes the packing change what bit is being
mapped to what based on which bits you're currently packing or unpacking.
Worse, the actual calculations within adjust_for_msb_right_quirk don't make
sense.
Consider the case when packing the last byte of an unaligned packing. It
might have a start bit of 7 and an end bit of 5. This would have a width of
3 bits. The new_start_bit will be calculated as the width - the box_end_bit
- 1. This will underflow and produce a negative value, which will
ultimately result in generating a new box_mask of all 0s.
For any other values, the result of the calculations of the
new_box_end_bit, new_box_start_bit, and the new box_mask will result in the
exact same values for the box_end_bit, box_start_bit, and box_mask. This
makes the calculations completely irrelevant.
If box_end_bit is 0, and box_start_bit is 7, then the entire function of
adjust_for_msb_right_quirk will boil down to just:
*to_write = bitrev8(*to_write)
The other adjustments are attempting (incorrectly) to keep the bits in the
same place but just reversed. This is not the right behavior even if
implemented correctly, as it leaves the mapping dependent on the bit values
being packed or unpacked.
Remove adjust_for_msb_right_quirk() and just use bitrev8 to reverse the
bit order of each byte when interacting with the packed data.
In particular, for packing, we need to reverse both the box_mask and the
physical value being packed. This is done after shifting the value by
box_end_bit so that the reversed mapping is always aligned to the physical
buffer byte boundary. The box_mask is reversed as we're about to use it to
clear any stale bits in the physical buffer at this block.
For unpacking, we need to reverse the contents of the physical buffer
*before* masking with the box_mask. This is critical, as the box_mask is a
logical mask of the bit layout before handling the QUIRK_MSB_ON_THE_RIGHT.
Add several new tests which cover this behavior. These tests will fail
without the fix and pass afterwards. Note that no current drivers make use
of QUIRK_MSB_ON_THE_RIGHT. I suspect this is why there have been no reports
of this inconsistency before.
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Link: https://patch.msgid.link/20241002-packing-kunit-tests-and-split-pack-unpack-v2-8-8373e551eae3@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-10-02 14:51:57 -07:00
		pval = ((u8 *)pbuf)[box_addr];

		if (quirks & QUIRK_MSB_ON_THE_RIGHT)
			pval = bitrev8(pval);

		pval &= box_mask;

		pval >>= box_end_bit;
		pval <<= proj_end_bit;
		*uval &= ~proj_mask;
		*uval |= pval;
	}
}
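
As a rough illustration of the read step in the loop above, the sketch below extracts a bit range from a single byte and reverses the byte before masking when the quirk is requested. It is a user-space model, not the kernel code: bitrev8_sketch() and read_box_sketch() are hypothetical names, and the mask is built by hand instead of with GENMASK().

```c
#include <assert.h>
#include <stdint.h>

/* User-space stand-in for the kernel's bitrev8(): reverse the bits of a byte. */
static uint8_t bitrev8_sketch(uint8_t b)
{
	b = (uint8_t)(((b & 0xF0) >> 4) | ((b & 0x0F) << 4));
	b = (uint8_t)(((b & 0xCC) >> 2) | ((b & 0x33) << 2));
	b = (uint8_t)(((b & 0xAA) >> 1) | ((b & 0x55) << 1));
	return b;
}

/*
 * Hypothetical helper: return logical bits start..end (7 >= start >= end >= 0)
 * of one physical byte. The mask is always a logical mask, so with the quirk
 * set the byte is reversed first and masked afterwards.
 */
static uint8_t read_box_sketch(uint8_t phys, unsigned int start,
			       unsigned int end, int msb_on_the_right)
{
	uint8_t box_mask;

	assert(start <= 7 && end <= start);
	box_mask = (uint8_t)((0xFFu >> (7 - start)) & (0xFFu << end));

	if (msb_on_the_right)
		phys = bitrev8_sketch(phys);

	return (uint8_t)((phys & box_mask) >> end);
}
```

Because the reversal happens before the mask is applied, a given logical bit always maps to the same physical bit, whichever range is being read.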

/**
 * unpack - Unpack u64 number from packed buffer.
 *
 * @pbuf: Pointer to a buffer holding the packed value.
 * @uval: Pointer to a u64 holding the unpacked value.
 * @startbit: The index (in logical notation, compensated for quirks) where
 *	      the packed value starts within pbuf. Must be larger than, or
 *	      equal to, endbit.
 * @endbit: The index (in logical notation, compensated for quirks) where
 *	    the packed value ends within pbuf. Must be smaller than, or equal
 *	    to, startbit.
 * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
 * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
 *	    QUIRK_MSB_ON_THE_RIGHT.
 *
 * Return: 0 on success, -EINVAL or -ERANGE if called incorrectly. Assuming
 *	   correct usage, return code may be discarded. The @uval will be
 *	   modified on success.
 */
int unpack(const void *pbuf, u64 *uval, size_t startbit, size_t endbit,
	   size_t pbuflen, u8 quirks)
{
	/* width of the field to access in the pbuf */
	u64 value_width;

	/* startbit is expected to be larger than or equal to endbit, and
	 * startbit must lie within the logically addressable range of the
	 * buffer.
	 */
	if (startbit < endbit || startbit >= BITS_PER_BYTE * pbuflen)
		/* Invalid function call */
		return -EINVAL;

	value_width = startbit - endbit + 1;
	if (value_width > 64)
		return -ERANGE;

	__unpack(pbuf, uval, startbit, endbit, pbuflen, quirks);

	return 0;
}
EXPORT_SYMBOL(unpack);
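
To make the argument checks and the per-byte projection in unpack() concrete, here is a simplified user-space model. It is only a sketch under stated assumptions: it handles quirks == 0 only, it assumes that in this layout logical byte "box" sits at physical offset pbuflen - 1 - box, and unpack_sketch() is a hypothetical name, not a kernel function.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Simplified model of unpack() for quirks == 0 (an assumption of this
 * sketch): logical byte "box" is read from physical offset pbuflen - 1 - box.
 */
static int unpack_sketch(const uint8_t *pbuf, uint64_t *uval, size_t startbit,
			 size_t endbit, size_t pbuflen)
{
	size_t first_u8, last_u8, box;
	uint64_t val = 0;

	assert(pbuf && uval);

	/* Same argument checks as in the function above */
	if (startbit < endbit || startbit >= 8 * pbuflen)
		return -EINVAL;
	if (startbit - endbit + 1 > 64)
		return -ERANGE;

	first_u8 = startbit / 8;
	last_u8 = endbit / 8;

	for (box = last_u8; box <= first_u8; box++) {
		/* Only the first and last boxes may be partially covered */
		unsigned int box_start_bit =
			(box == first_u8) ? (unsigned int)(startbit % 8) : 7;
		unsigned int box_end_bit =
			(box == last_u8) ? (unsigned int)(endbit % 8) : 0;
		uint8_t box_mask = (uint8_t)((0xFFu >> (7 - box_start_bit)) &
					     (0xFFu << box_end_bit));
		/* Position of this box's lowest used bit inside the result */
		size_t proj_end_bit = box * 8 + box_end_bit - endbit;
		uint64_t pval = pbuf[pbuflen - 1 - box] & box_mask;

		val |= (pval >> box_end_bit) << proj_end_bit;
	}

	*uval = val;
	return 0;
}
```

With the 4-byte buffer { 0x12, 0x34, 0x56, 0x78 }, this model treats the buffer as the 32-bit number 0x12345678, so bits 23 down to 8 come out as 0x3456.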

/**
 * packing - Convert numbers (currently u64) between a packed and an unpacked
 *	     format. Unpacked means laid out in memory in the CPU's native
 *	     understanding of integers, while packed means anything else that
 *	     requires translation.
 *
 * @pbuf: Pointer to a buffer holding the packed value.
 * @uval: Pointer to a u64 holding the unpacked value.
 * @startbit: The index (in logical notation, compensated for quirks) where
 *	      the packed value starts within pbuf. Must be larger than, or
 *	      equal to, endbit.
 * @endbit: The index (in logical notation, compensated for quirks) where
 *	    the packed value ends within pbuf. Must be smaller than, or equal
 *	    to, startbit.
 * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
 * @op: If PACK, then uval will be treated as const pointer and copied (packed)
 *	into pbuf, between startbit and endbit.
 *	If UNPACK, then pbuf will be treated as const pointer and the logical
 *	value between startbit and endbit will be copied (unpacked) to uval.
 * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
 *	    QUIRK_MSB_ON_THE_RIGHT.
 *
 * Note: this is deprecated, prefer to use pack() or unpack() in new code.
 *
 * Return: 0 on success, -EINVAL or -ERANGE if called incorrectly. Assuming
 *	   correct usage, return code may be discarded.
 *	   If op is PACK, pbuf is modified.
 *	   If op is UNPACK, uval is modified.
 */
int packing(void *pbuf, u64 *uval, int startbit, int endbit, size_t pbuflen,
	    enum packing_op op, u8 quirks)
{
	if (op == PACK)
		return pack(pbuf, *uval, startbit, endbit, pbuflen, quirks);

	return unpack(pbuf, uval, startbit, endbit, pbuflen, quirks);
}
EXPORT_SYMBOL(packing);

lib: packing: add pack_fields() and unpack_fields()
This is new API which caters to the following requirements:
- Pack or unpack a large number of fields to/from a buffer with a small
code footprint. The current alternative is to open-code a large number
of calls to pack() and unpack(), or to use packing() to reduce that
number to half. But packing() is not const-correct.
- Use unpacked numbers stored in variables smaller than u64. This
reduces the rodata footprint of the stored field arrays.
- Perform error checking at compile time, rather than runtime, and return
void from the API functions. Because the C preprocessor can't generate
variable length code (loops), this is a bit tricky to do with macros.
To handle this, implement macros which sanity check the packed field
definitions based on their size. Finally, a single macro with a chain of
__builtin_choose_expr() is used to select the appropriate macros. We
enforce the use of ascending or descending order to avoid O(N^2) scaling
when checking for overlap. Note that the macros are written with care to
ensure that the compilers can correctly evaluate the resulting code at
compile time. In particular, care was taken with avoiding too many nested
statement expressions. Nested statement expressions trip up some
compilers, especially when passing down variables created in previous
statement expressions.
There are two key design choices intended to keep the overall macro code
size small. First, the definition of each CHECK_PACKED_FIELDS_N macro is
implemented recursively, by calling the N-1 macro. This avoids needing
the code to repeat multiple times.
Second, the CHECK_PACKED_FIELD macro enforces that the fields in the
array are sorted in order. This allows checking for overlap only with
neighboring fields, rather than the general overlap case where each field
would need to be checked against other fields.
The overlap checks use the first two fields to determine the order of the
remaining fields, thus allowing either ascending or descending order.
This gives drivers the flexibility to keep the fields in whichever order
most naturally fits their hardware design and its associated
documentation.
The CHECK_PACKED_FIELDS macro is directly called from within pack_fields
and unpack_fields, ensuring that all drivers using the API receive the
benefits of the compile-time checks. Users do not need to call any of the
macros directly.
The CHECK_PACKED_FIELDS and its helper macros CHECK_PACKED_FIELDS_(0..50)
are generated using a simple C program in scripts/gen_packed_field_checks.c.
This program can be compiled on demand and executed to generate the
macro code in include/linux/packing.h. This will aid in the event that a
driver needs more than 50 fields. The generator can be updated with a new
size, and used to update the packing.h header file. In practice, the ice
driver will need to support 27 fields, and the sja1105 driver will need
to support 0 fields. This on-demand generation avoids the need to modify
Kbuild. We do not anticipate the maximum number of fields to grow very
often.
- Reduced rodata footprint for the storage of the packed field arrays.
To that end, we have struct packed_field_u8 and packed_field_u16, which
define the fields with the associated type. More can be added as
needed (unlikely for now). On these types, the same generic pack_fields()
and unpack_fields() API can be used, thanks to the new C11 _Generic()
selection feature, which can call pack_fields_u8() or pack_fields_u16(),
depending on the type of the "fields" array - a simplistic form of
polymorphism. It is evaluated at compile time which function will actually
be called.
Over time, packing() is expected to be completely replaced either with
pack() or with pack_fields().
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Co-developed-by: Jacob Keller <jacob.e.keller@intel.com>
Signed-off-by: Jacob Keller <jacob.e.keller@intel.com>
Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20241210-packing-pack-fields-and-ice-implementation-v10-3-ee56a47479ac@intel.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-12-10 12:27:12 -08:00
static u64 ustruct_field_to_u64(const void *ustruct, size_t field_offset,
				size_t field_size)
{
	switch (field_size) {
	case 1:
		return *((u8 *)(ustruct + field_offset));
	case 2:
		return *((u16 *)(ustruct + field_offset));
	case 4:
		return *((u32 *)(ustruct + field_offset));
	default:
		return *((u64 *)(ustruct + field_offset));
	}
}

static void u64_to_ustruct_field(void *ustruct, size_t field_offset,
				 size_t field_size, u64 uval)
{
	switch (field_size) {
	case 1:
		*((u8 *)(ustruct + field_offset)) = uval;
		break;
	case 2:
		*((u16 *)(ustruct + field_offset)) = uval;
		break;
	case 4:
		*((u32 *)(ustruct + field_offset)) = uval;
		break;
	default:
		*((u64 *)(ustruct + field_offset)) = uval;
		break;
	}
}
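
The two helpers above address a structure member purely by byte offset and size. A user-space sketch of the same idea follows; struct demo_cfg, field_to_u64_sketch() and u64_to_field_sketch() are hypothetical names, and memcpy() is used here instead of the pointer casts so the sketch does not depend on the GNU void-pointer arithmetic extension.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical unpacked structure, used only for this illustration. */
struct demo_cfg {
	uint8_t prio;
	uint16_t vlan;
	uint32_t rate;
	uint64_t mac;
};

/* Read a 1, 2, 4 or 8 byte member at field_offset and widen it to 64 bits. */
static uint64_t field_to_u64_sketch(const void *ustruct, size_t field_offset,
				    size_t field_size)
{
	const unsigned char *p = (const unsigned char *)ustruct + field_offset;
	uint64_t v64;
	uint32_t v32;
	uint16_t v16;
	uint8_t v8;

	switch (field_size) {
	case 1:
		memcpy(&v8, p, sizeof(v8));
		return v8;
	case 2:
		memcpy(&v16, p, sizeof(v16));
		return v16;
	case 4:
		memcpy(&v32, p, sizeof(v32));
		return v32;
	default:
		assert(field_size == sizeof(v64));
		memcpy(&v64, p, sizeof(v64));
		return v64;
	}
}

/* Store uval into the member at field_offset, truncating to field_size. */
static void u64_to_field_sketch(void *ustruct, size_t field_offset,
				size_t field_size, uint64_t uval)
{
	unsigned char *p = (unsigned char *)ustruct + field_offset;
	uint32_t v32 = (uint32_t)uval;
	uint16_t v16 = (uint16_t)uval;
	uint8_t v8 = (uint8_t)uval;

	switch (field_size) {
	case 1:
		memcpy(p, &v8, sizeof(v8));
		break;
	case 2:
		memcpy(p, &v16, sizeof(v16));
		break;
	case 4:
		memcpy(p, &v32, sizeof(v32));
		break;
	default:
		assert(field_size == sizeof(uval));
		memcpy(p, &uval, sizeof(uval));
		break;
	}
}
```

A caller would typically pass offsetof(struct demo_cfg, vlan) and sizeof(cfg.vlan); a value wider than the member is silently truncated, as it is by the assignments in the helpers above.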

/**
 * pack_fields_u8 - Pack array of fields
 *
 * @pbuf: Pointer to a buffer holding the packed value.
 * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
 * @ustruct: Pointer to CPU-readable structure holding the unpacked value.
 *	     It is expected (but not checked) that this has the same data type
 *	     as all struct packed_field_u8 definitions.
 * @fields: Array of packed_field_u8 field definitions. They must not overlap.
 * @num_fields: Length of @fields array.
 * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
 *	    QUIRK_MSB_ON_THE_RIGHT.
 *
 * Use the pack_fields() macro instead of calling this directly.
 */
void pack_fields_u8(void *pbuf, size_t pbuflen, const void *ustruct,
		    const struct packed_field_u8 *fields, size_t num_fields,
		    u8 quirks)
{
	__pack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks);
}
EXPORT_SYMBOL(pack_fields_u8);

/**
 * pack_fields_u16 - Pack array of fields
 *
 * @pbuf: Pointer to a buffer holding the packed value.
 * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
 * @ustruct: Pointer to CPU-readable structure holding the unpacked value.
 *	     It is expected (but not checked) that this has the same data type
 *	     as all struct packed_field_u16 definitions.
 * @fields: Array of packed_field_u16 field definitions. They must not overlap.
 * @num_fields: Length of @fields array.
 * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
 *	    QUIRK_MSB_ON_THE_RIGHT.
 *
 * Use the pack_fields() macro instead of calling this directly.
 */
void pack_fields_u16(void *pbuf, size_t pbuflen, const void *ustruct,
		     const struct packed_field_u16 *fields, size_t num_fields,
		     u8 quirks)
{
	__pack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks);
}
EXPORT_SYMBOL(pack_fields_u16);

/**
 * unpack_fields_u8 - Unpack array of fields
 *
 * @pbuf: Pointer to a buffer holding the packed value.
 * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
 * @ustruct: Pointer to CPU-readable structure holding the unpacked value.
 *	     It is expected (but not checked) that this has the same data type
 *	     as all struct packed_field_u8 definitions.
 * @fields: Array of packed_field_u8 field definitions. They must not overlap.
 * @num_fields: Length of @fields array.
 * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
 *	    QUIRK_MSB_ON_THE_RIGHT.
 *
 * Use the unpack_fields() macro instead of calling this directly.
 */
void unpack_fields_u8(const void *pbuf, size_t pbuflen, void *ustruct,
		      const struct packed_field_u8 *fields, size_t num_fields,
		      u8 quirks)
{
	__unpack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks);
}
EXPORT_SYMBOL(unpack_fields_u8);

/**
 * unpack_fields_u16 - Unpack array of fields
 *
 * @pbuf: Pointer to a buffer holding the packed value.
 * @pbuflen: The length in bytes of the packed buffer pointed to by @pbuf.
 * @ustruct: Pointer to CPU-readable structure holding the unpacked value.
 *	     It is expected (but not checked) that this has the same data type
 *	     as all struct packed_field_u16 definitions.
 * @fields: Array of packed_field_u16 field definitions. They must not overlap.
 * @num_fields: Length of @fields array.
 * @quirks: A bit mask of QUIRK_LITTLE_ENDIAN, QUIRK_LSW32_IS_FIRST and
 *	    QUIRK_MSB_ON_THE_RIGHT.
 *
 * Use the unpack_fields() macro instead of calling this directly.
 */
void unpack_fields_u16(const void *pbuf, size_t pbuflen, void *ustruct,
		       const struct packed_field_u16 *fields, size_t num_fields,
		       u8 quirks)
{
	__unpack_fields(pbuf, pbuflen, ustruct, fields, num_fields, quirks);
}
EXPORT_SYMBOL(unpack_fields_u16);
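
The kernel-doc comments above direct callers to the pack_fields() and unpack_fields() macros, which the commit message describes as choosing between the u8 and u16 variants with C11 _Generic. The sketch below shows that kind of compile-time dispatch in isolation. All names in it (field_u8_sketch, field_u16_sketch, total_width and its helpers) are hypothetical and much simpler than the real definitions in include/linux/packing.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptors; the real struct packed_field_u8/u16 may differ. */
struct field_u8_sketch {
	uint8_t startbit;
	uint8_t endbit;
};

struct field_u16_sketch {
	uint16_t startbit;
	uint16_t endbit;
};

/* Sum of the bit widths of all fields, u8 descriptor flavour. */
static size_t total_width_u8(const struct field_u8_sketch *fields,
			     size_t num_fields)
{
	size_t i, bits = 0;

	for (i = 0; i < num_fields; i++) {
		assert(fields[i].startbit >= fields[i].endbit);
		bits += (size_t)(fields[i].startbit - fields[i].endbit) + 1;
	}
	return bits;
}

/* Same computation, u16 descriptor flavour. */
static size_t total_width_u16(const struct field_u16_sketch *fields,
			      size_t num_fields)
{
	size_t i, bits = 0;

	for (i = 0; i < num_fields; i++) {
		assert(fields[i].startbit >= fields[i].endbit);
		bits += (size_t)(fields[i].startbit - fields[i].endbit) + 1;
	}
	return bits;
}

/*
 * Pick the helper from the element type of "fields" at compile time.
 * &(fields)[0] yields a pointer whether an array or a pointer is passed.
 */
#define total_width(fields, num_fields)					\
	_Generic(&(fields)[0],						\
		 const struct field_u8_sketch *: total_width_u8,	\
		 const struct field_u16_sketch *: total_width_u16	\
	)((fields), (num_fields))
```

Keeping the descriptors in the smaller type where the bit indices fit is what reduces the rodata footprint. The caller writes the same macro either way, and a descriptor type with no matching association is rejected at compile time.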

MODULE_DESCRIPTION("Generic bitfield packing and unpacking");