lib: zstd: Backport fix for in-place decompression

Backport the relevant part of upstream commit 5b266196 [0].

This fixes in-place decompression for x86-64 kernel decompression. It
uses a bound of 131072 + (uncompressed_size >> 8), which can be violated
after upstream commit 6a7ede3d [1], as zstd can use part of the output
buffer as temporary storage, and without this patch needs a bound of
~262144.

The fix is for zstd to detect that the input and output buffers overlap,
so that zstd knows it can't use the overlapping portion of the output
buffer as tempoary storage. If the margin is not large enough, this will
ensure that zstd will fail the decompression, rather than overwriting
part of the input data, and causing corruption.

This fix has been landed upstream and is in release v1.5.4. That commit
also adds unit and fuzz tests to verify that the margin we use is
respected, and correct. That means that the fix is well tested upstream.

I have not been able to reproduce the potential bug in x86-64 kernel
decompression locally, nor have I recieved reports of failures to
decompress the kernel. It is possible that compression saves enough
space to make it very hard for the issue to appear.

I've boot tested the zstd compressed kernel on x86-64 and i386 with this
patch, which uses in-place decompression, and sanity tested zstd compression
in btrfs / squashfs to make sure that we don't see any issues, but other
uses of zstd shouldn't be affected, because they don't use in-place
decompression.

Thanks to Vasily Gorbik <gor@linux.ibm.com> for debugging a related issue
on s390, which was triggered by the same commit, but was a bug in how
__decompress() was called [2]. And to Sasha Levin <sashal@kernel.org>
for the CC alerting me of the issue.

[0] 5b266196a4
[1] 6a7ede3dfc
[2] https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-here.call-01675030179-ext-9637@work.hours

CC: Vasily Gorbik <gor@linux.ibm.com>
CC: Heiko Carstens <hca@linux.ibm.com>
CC: Sasha Levin <sashal@kernel.org>
CC: Yann Collet <cyan@fb.com>
Signed-off-by: Nick Terrell <terrelln@fb.com>
This commit is contained in:
Nick Terrell 2023-02-15 15:19:17 -08:00
parent 780f6a9afe
commit 038505c41f

View File

@ -798,7 +798,7 @@ static size_t ZSTD_copyRawBlock(void* dst, size_t dstCapacity,
if (srcSize == 0) return 0;
RETURN_ERROR(dstBuffer_null, "");
}
ZSTD_memcpy(dst, src, srcSize);
ZSTD_memmove(dst, src, srcSize);
return srcSize;
}
@ -858,6 +858,7 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx,
/* Loop on each block */
while (1) {
BYTE* oBlockEnd = oend;
size_t decodedSize;
blockProperties_t blockProperties;
size_t const cBlockSize = ZSTD_getcBlockSize(ip, remainingSrcSize, &blockProperties);
@ -867,16 +868,34 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx,
remainingSrcSize -= ZSTD_blockHeaderSize;
RETURN_ERROR_IF(cBlockSize > remainingSrcSize, srcSize_wrong, "");
if (ip >= op && ip < oBlockEnd) {
/* We are decompressing in-place. Limit the output pointer so that we
* don't overwrite the block that we are currently reading. This will
* fail decompression if the input & output pointers aren't spaced
* far enough apart.
*
* This is important to set, even when the pointers are far enough
* apart, because ZSTD_decompressBlock_internal() can decide to store
* literals in the output buffer, after the block it is decompressing.
* Since we don't want anything to overwrite our input, we have to tell
* ZSTD_decompressBlock_internal to never write past ip.
*
* See ZSTD_allocateLiteralsBuffer() for reference.
*/
oBlockEnd = op + (ip - op);
}
switch(blockProperties.blockType)
{
case bt_compressed:
decodedSize = ZSTD_decompressBlock_internal(dctx, op, (size_t)(oend-op), ip, cBlockSize, /* frame */ 1, not_streaming);
decodedSize = ZSTD_decompressBlock_internal(dctx, op, (size_t)(oBlockEnd-op), ip, cBlockSize, /* frame */ 1, not_streaming);
break;
case bt_raw :
/* Use oend instead of oBlockEnd because this function is safe to overlap. It uses memmove. */
decodedSize = ZSTD_copyRawBlock(op, (size_t)(oend-op), ip, cBlockSize);
break;
case bt_rle :
decodedSize = ZSTD_setRleBlock(op, (size_t)(oend-op), *ip, blockProperties.origSize);
decodedSize = ZSTD_setRleBlock(op, (size_t)(oBlockEnd-op), *ip, blockProperties.origSize);
break;
case bt_reserved :
default: