linux_dsm_epyc7002/lib/lz4/lz4_decompress.c
Sven Schmidt 4e1a33b105 lib: update LZ4 compressor module
Patch series "Update LZ4 compressor module", v7.

This patchset updates the LZ4 compression module to a version based on
LZ4 v1.7.3.  The new version adds the fast compression algorithm, also
known as LZ4 fast, whose "acceleration" parameter trades compression
ratio for compression speed.

We want to use LZ4 fast to support compression in Lustre and, building
on that, to investigate data reduction techniques for storage systems.

It will also be useful for other users of LZ4 compression, since LZ4
fast lets applications choose between fast and high compression
depending on the use case.  For instance, ZRAM offers an LZ4 backend
and could benefit from an updated LZ4 in the kernel.

LZ4 homepage: http://www.lz4.org/
LZ4 source repository: https://github.com/lz4/lz4
Source version: 1.7.3

Benchmark (taken from [1], Core i5-4300U @1.9GHz):
----------------|--------------|----------------|----------
Compressor      | Compression  | Decompression  | Ratio
----------------|--------------|----------------|----------
memcpy          |  4200 MB/s   |  4200 MB/s     | 1.000
LZ4 fast 50     |  1080 MB/s   |  2650 MB/s     | 1.375
LZ4 fast 17     |   680 MB/s   |  2220 MB/s     | 1.607
LZ4 fast 5      |   475 MB/s   |  1920 MB/s     | 1.886
LZ4 default     |   385 MB/s   |  1850 MB/s     | 2.101

[1] http://fastcompression.blogspot.de/2015/04/sampling-or-faster-lz4.html

[PATCH 1/5] lib: Update LZ4 compressor module
[PATCH 2/5] lib/decompress_unlz4: Change module to work with new LZ4 module version
[PATCH 3/5] crypto: Change LZ4 modules to work with new LZ4 module version
[PATCH 4/5] fs/pstore: fs/squashfs: Change usage of LZ4 to work with new LZ4 version
[PATCH 5/5] lib/lz4: Remove back-compat wrappers

This patch (of 5):

Update the LZ4 kernel module to LZ4 v1.7.3 by Yann Collet.  The kernel
module is inspired by the previous work by Chanho Min.  The updated LZ4
module will not break existing code since the patchset contains
appropriate changes.

API changes:

New function LZ4_compress_fast, which differs from the variant
previously available in the kernel by its acceleration parameter.  A
higher value trades compression ratio for compression speed, and a
lower value does the reverse.
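
As a rough illustration of why a larger acceleration value is faster,
the toy model below counts how many positions a match search probes
while walking over data that never matches.  The skip constant and the
stride rule are assumptions modelled on the upstream LZ4 reference
implementation, not code from this patch, and toy_count_probes is a
hypothetical helper.

```c
#include <assert.h>

/* Assumed value, modelled on upstream LZ4; not taken from this patch. */
#define TOY_SKIP_TRIGGER 6

/*
 * Count how many positions a match search would probe while walking
 * `span` bytes of data in which no match is ever found.
 */
static unsigned int toy_count_probes(unsigned int acceleration,
				     unsigned int span)
{
	unsigned int pos = 0;
	unsigned int probes = 0;
	unsigned int step = 1;
	unsigned int search_nb;

	assert(acceleration >= 1);
	search_nb = acceleration << TOY_SKIP_TRIGGER;

	while (pos < span) {
		probes++;	/* one lookup at this position */
		pos += step;
		/* the stride grows by one after every 64 failed probes */
		step = search_nb++ >> TOY_SKIP_TRIGGER;
	}
	return probes;
}
```

Under these assumptions, toy_count_probes(1, 64) probes all 64
positions while toy_count_probes(50, 64) probes only 3.  Fewer probes
mean less work but also fewer chances to find a match, which is
consistent with the lower ratios in the benchmark table above.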

LZ4_decompress_fast is the respective decompression method, featuring a
very fast decoder (multiple GB/s per core), able to reach RAM speed in
multi-core systems.  The decompressor handles data compressed with LZ4
fast as well as with the LZ4 HC (high compression) algorithm.
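
The decoder in lz4_decompress.c starts every sequence by reading one
token byte and splitting it into a literal-length field and a
match-length field, each of which can be extended by extra bytes.  A
minimal sketch of that step, assuming the usual 4-bit fields; the toy_*
names are hypothetical and the bounds handling is simplified:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed format constants: two 4-bit fields per token. */
#define TOY_ML_BITS	4
#define TOY_ML_MASK	((1U << TOY_ML_BITS) - 1)

/* Upper four bits of the token: literal length field. */
static unsigned int toy_literal_field(uint8_t token)
{
	return token >> TOY_ML_BITS;
}

/* Lower four bits of the token: match length field. */
static unsigned int toy_match_field(uint8_t token)
{
	return token & TOY_ML_MASK;
}

/*
 * Extend a 4-bit length field.  A field equal to 15 is followed by
 * extra bytes that are summed until a byte other than 255 is seen.
 * Returns 0 on success, -1 if the input ends too early.
 */
static int toy_read_length(const uint8_t *src, size_t src_len, size_t *ip,
			   unsigned int field, size_t *length)
{
	uint8_t s;

	assert(field <= 15);
	*length = field;
	if (field != 15)
		return 0;
	do {
		if (*ip >= src_len)
			return -1;
		s = src[(*ip)++];
		*length += s;
	} while (s == 255);
	return 0;
}
```

For example, a field of 15 followed by the bytes 255 and 10 encodes a
length of 15 + 255 + 10 = 280, while a field of 7 consumes no extra
bytes.  For matches, the decoder then adds the minimum match length on
top of the decoded value.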

The functions LZ4_decompress_safe_partial and LZ4_compress_destSize
were also added.  The former decompresses only part of a block.  The
latter reverses the usual logic: given a fixed-size destination buffer,
it compresses as much source data as will fit.
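
The "reversed" calling convention can be shown without LZ4 itself.  The
sketch below uses a toy run-length encoder (explicitly not LZ4) whose
source length is an in/out parameter: the caller fixes the destination
size, and the function reports how much input it managed to consume.
toy_rle_dest_size and its parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Toy run-length encoder (NOT LZ4): emits (count, byte) pairs.
 * On entry *src_len is the number of bytes available; on return it
 * holds the number of bytes actually consumed.
 * Returns the number of bytes written to dst.
 */
static size_t toy_rle_dest_size(const uint8_t *src, size_t *src_len,
				uint8_t *dst, size_t dst_cap)
{
	size_t ip = 0, op = 0;

	assert(src && src_len && dst);
	while (ip < *src_len && op + 2 <= dst_cap) {
		uint8_t byte = src[ip];
		size_t run = 1;

		/* extend the run, capped so the count fits in one byte */
		while (ip + run < *src_len && src[ip + run] == byte &&
		       run < 255)
			run++;
		dst[op++] = (uint8_t)run;
		dst[op++] = byte;
		ip += run;
	}
	*src_len = ip;	/* report how much input fit */
	return op;
}
```

With the 12-byte input "aaaabbbbcccc" and a 4-byte destination, the
function writes 4 bytes and sets the source length to 8, so the caller
knows where to resume.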

Several streaming functions were also added, which allow
compressing/decompressing data in multiple steps (so-called "streaming
mode").
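
In streaming mode a later block may refer back to bytes produced by an
earlier one, which is why previously decoded data has to stay where it
was written.  The following is a simplified, hypothetical decoder (not
the kernel implementation) that appends each block to one contiguous
buffer and lets match offsets reach into the earlier output.  The
blocks in the usage note are hand-built for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Extend a length field: 15 means "add following bytes until one is < 255". */
static int toy_len(const uint8_t *src, size_t n, size_t *ip, size_t *len)
{
	uint8_t s;

	if (*len != 15)
		return 0;
	do {
		if (*ip >= n)
			return -1;
		s = src[(*ip)++];
		*len += s;
	} while (s == 255);
	return 0;
}

/*
 * Decode one block into buf starting at position pos.  Bytes
 * buf[0..pos) are earlier output and may be referenced by offsets.
 * Returns the number of bytes appended, or -1 on malformed input.
 */
static int toy_decode_continue(const uint8_t *src, size_t n,
			       uint8_t *buf, size_t cap, size_t pos)
{
	size_t ip = 0, op = pos;

	assert(pos <= cap);
	while (ip < n) {
		unsigned int token = src[ip++];
		size_t len = token >> 4, offset, i;

		if (toy_len(src, n, &ip, &len) || len > n - ip ||
		    len > cap - op)
			return -1;
		memcpy(buf + op, src + ip, len);	/* literals */
		ip += len;
		op += len;
		if (ip == n)
			break;	/* final sequence: literals only */

		if (n - ip < 2)
			return -1;
		/* 16-bit little-endian match offset */
		offset = src[ip] | ((size_t)src[ip + 1] << 8);
		ip += 2;
		if (offset == 0 || offset > op)
			return -1;	/* would reach before the history */

		len = token & 15;
		if (toy_len(src, n, &ip, &len) || len + 4 > cap - op)
			return -1;
		len += 4;	/* minimum match length */
		for (i = 0; i < len; i++, op++)	/* byte-wise: may overlap */
			buf[op] = buf[op - offset];
	}
	return (int)(op - pos);
}
```

Decoding the block { 0x60, "hello " } at position 0 and then the block
{ 0x01, 0x06, 0x00, 0x80, " again!!" } at position 6 yields
"hello hello again!!".  Decoding the second block on its own at
position 0 fails, because its match offset points at history that is
not there.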

The methods lz4_compress and lz4_decompress_unknownoutputsize are now
known as LZ4_compress_default and LZ4_decompress_safe, respectively.
The old methods will be removed since there are no callers left in the
code.
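
The two generations of the API report results differently: the old
functions return 0 or a negative error and pass the size back through
a pointer, while the new ones return a byte count or a negative error.
A sketch of an adapter between the two conventions follows.  It uses a
stand-in decoder that merely copies; all toy_* names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* New-style convention: byte count on success, negative on error. */
typedef int (*toy_decode_fn)(const char *src, char *dst,
			     int src_len, int dst_cap);

/* Hypothetical stand-in decoder: "decompresses" by copying. */
static int toy_copy_decode(const char *src, char *dst,
			   int src_len, int dst_cap)
{
	if (src_len > dst_cap)
		return -1;
	memcpy(dst, src, (size_t)src_len);
	return src_len;
}

/* Old-style convention: 0 on success, negative on error, size via pointer. */
static int toy_legacy_decode(toy_decode_fn decode, const char *src,
			     size_t src_len, char *dst, size_t *dst_len)
{
	int ret;

	assert(decode && dst_len);
	if (src_len > INT_MAX || *dst_len > INT_MAX)
		return -1;
	/* keep the result signed so an error is not mistaken for a size */
	ret = decode(src, dst, (int)src_len, (int)*dst_len);
	if (ret < 0)
		return -1;
	*dst_len = (size_t)ret;
	return 0;
}
```

The point of the intermediate int is that storing a negative result
directly in a size_t turns it into a very large positive value, after
which a "greater than zero" check can no longer detect the error.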

[arnd@arndb.de: fix KERNEL_LZ4 support]
  Link: http://lkml.kernel.org/r/20170208211946.2839649-1-arnd@arndb.de
[akpm@linux-foundation.org: simplify]
[akpm@linux-foundation.org: fix the simplification]
[4sschmid@informatik.uni-hamburg.de: fix performance regressions]
  Link: http://lkml.kernel.org/r/1486898178-17125-2-git-send-email-4sschmid@informatik.uni-hamburg.de
[4sschmid@informatik.uni-hamburg.de: v8]
  Link: http://lkml.kernel.org/r/1487182598-15351-2-git-send-email-4sschmid@informatik.uni-hamburg.de
Link: http://lkml.kernel.org/r/1486321748-19085-2-git-send-email-4sschmid@informatik.uni-hamburg.de
Signed-off-by: Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Cc: Bongkyu Kim <bongkyu.kim@lge.com>
Cc: Rui Salvaterra <rsalvaterra@gmail.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky@gmail.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Herbert Xu <herbert@gondor.apana.org.au>
Cc: David S. Miller <davem@davemloft.net>
Cc: Anton Vorontsov <anton@enomsg.org>
Cc: Colin Cross <ccross@android.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Tony Luck <tony.luck@intel.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2017-02-24 17:46:57 -08:00


/*
* LZ4 - Fast LZ compression algorithm
* Copyright (C) 2011 - 2016, Yann Collet.
* BSD 2-Clause License (http://www.opensource.org/licenses/bsd-license.php)
* Redistribution and use in source and binary forms, with or without
* modification, are permitted provided that the following conditions are
* met:
* * Redistributions of source code must retain the above copyright
* notice, this list of conditions and the following disclaimer.
* * Redistributions in binary form must reproduce the above
* copyright notice, this list of conditions and the following disclaimer
* in the documentation and/or other materials provided with the
* distribution.
* THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
* "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
* LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
* A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
* OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
* SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
* LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
* DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
* THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
* (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
* OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
* You can contact the author at :
* - LZ4 homepage : http://www.lz4.org
* - LZ4 source repository : https://github.com/lz4/lz4
*
* Changed for kernel usage by:
* Sven Schmidt <4sschmid@informatik.uni-hamburg.de>
*/
/*-************************************
* Dependencies
**************************************/
#include <linux/lz4.h>
#include "lz4defs.h"
#include <linux/init.h>
#include <linux/module.h>
#include <linux/kernel.h>
#include <asm/unaligned.h>
/*-*****************************
* Decompression functions
*******************************/
/* LZ4_decompress_generic() :
* This generic decompression function covers all use cases.
* It is instantiated several times, using different sets of directives.
* Note that it is important that this generic function is really inlined,
* so that the compiler can remove useless branches during optimization.
*/
static FORCE_INLINE int LZ4_decompress_generic(
const char * const source,
char * const dest,
int inputSize,
/*
* If endOnInput == endOnInputSize,
* this value is the max size of Output Buffer.
*/
int outputSize,
/* endOnOutputSize, endOnInputSize */
int endOnInput,
/* full, partial */
int partialDecoding,
/* only used if partialDecoding == partial */
int targetOutputSize,
/* noDict, withPrefix64k, usingExtDict */
int dict,
/* == dest when no prefix */
const BYTE * const lowPrefix,
/* only if dict == usingExtDict */
const BYTE * const dictStart,
/* note : = 0 if noDict */
const size_t dictSize
)
{
/* Local Variables */
const BYTE *ip = (const BYTE *) source;
const BYTE * const iend = ip + inputSize;
BYTE *op = (BYTE *) dest;
BYTE * const oend = op + outputSize;
BYTE *cpy;
BYTE *oexit = op + targetOutputSize;
const BYTE * const lowLimit = lowPrefix - dictSize;
const BYTE * const dictEnd = (const BYTE *)dictStart + dictSize;
const unsigned int dec32table[] = { 0, 1, 2, 1, 4, 4, 4, 4 };
const int dec64table[] = { 0, 0, 0, -1, 0, 1, 2, 3 };
const int safeDecode = (endOnInput == endOnInputSize);
const int checkOffset = ((safeDecode) && (dictSize < (int)(64 * KB)));
/* Special cases */
/* targetOutputSize too high => decode everything */
if ((partialDecoding) && (oexit > oend - MFLIMIT))
oexit = oend - MFLIMIT;
/* Empty output buffer */
if ((endOnInput) && (unlikely(outputSize == 0)))
return ((inputSize == 1) && (*ip == 0)) ? 0 : -1;
if ((!endOnInput) && (unlikely(outputSize == 0)))
return (*ip == 0 ? 1 : -1);
/* Main Loop : decode sequences */
while (1) {
size_t length;
const BYTE *match;
size_t offset;
/* get literal length */
unsigned int const token = *ip++;
length = token>>ML_BITS;
if (length == RUN_MASK) {
unsigned int s;
do {
s = *ip++;
length += s;
} while (likely(endOnInput
? ip < iend - RUN_MASK
: 1) & (s == 255));
if ((safeDecode)
&& unlikely(
(size_t)(op + length) < (size_t)(op))) {
/* overflow detection */
goto _output_error;
}
if ((safeDecode)
&& unlikely(
(size_t)(ip + length) < (size_t)(ip))) {
/* overflow detection */
goto _output_error;
}
}
/* copy literals */
cpy = op + length;
if (((endOnInput) && ((cpy > (partialDecoding ? oexit : oend - MFLIMIT))
|| (ip + length > iend - (2 + 1 + LASTLITERALS))))
|| ((!endOnInput) && (cpy > oend - WILDCOPYLENGTH))) {
if (partialDecoding) {
if (cpy > oend) {
/*
* Error :
* write attempt beyond end of output buffer
*/
goto _output_error;
}
if ((endOnInput)
&& (ip + length > iend)) {
/*
* Error :
* read attempt beyond
* end of input buffer
*/
goto _output_error;
}
} else {
if ((!endOnInput)
&& (cpy != oend)) {
/*
* Error :
* block decoding must
* stop exactly there
*/
goto _output_error;
}
if ((endOnInput)
&& ((ip + length != iend)
|| (cpy > oend))) {
/*
* Error :
* input must be consumed
*/
goto _output_error;
}
}
memcpy(op, ip, length);
ip += length;
op += length;
/* Necessarily EOF, due to parsing restrictions */
break;
}
LZ4_wildCopy(op, ip, cpy);
ip += length;
op = cpy;
/* get offset */
offset = LZ4_readLE16(ip);
ip += 2;
match = op - offset;
if ((checkOffset) && (unlikely(match < lowLimit))) {
/* Error : offset outside buffers */
goto _output_error;
}
/* costs ~1%; silence an msan warning when offset == 0 */
LZ4_write32(op, (U32)offset);
/* get matchlength */
length = token & ML_MASK;
if (length == ML_MASK) {
unsigned int s;
do {
s = *ip++;
if ((endOnInput) && (ip > iend - LASTLITERALS))
goto _output_error;
length += s;
} while (s == 255);
if ((safeDecode)
&& unlikely(
(size_t)(op + length) < (size_t)op)) {
/* overflow detection */
goto _output_error;
}
}
length += MINMATCH;
/* check external dictionary */
if ((dict == usingExtDict) && (match < lowPrefix)) {
if (unlikely(op + length > oend - LASTLITERALS)) {
/* doesn't respect parsing restriction */
goto _output_error;
}
if (length <= (size_t)(lowPrefix - match)) {
/*
* match can be copied as a single segment
* from external dictionary
*/
memmove(op, dictEnd - (lowPrefix - match),
length);
op += length;
} else {
/*
* match spans the external
* dictionary and the current block
*/
size_t const copySize = (size_t)(lowPrefix - match);
size_t const restSize = length - copySize;
memcpy(op, dictEnd - copySize, copySize);
op += copySize;
if (restSize > (size_t)(op - lowPrefix)) {
/* overlap copy */
BYTE * const endOfMatch = op + restSize;
const BYTE *copyFrom = lowPrefix;
while (op < endOfMatch)
*op++ = *copyFrom++;
} else {
memcpy(op, lowPrefix, restSize);
op += restSize;
}
}
continue;
}
/* copy match within block */
cpy = op + length;
if (unlikely(offset < 8)) {
const int dec64 = dec64table[offset];
op[0] = match[0];
op[1] = match[1];
op[2] = match[2];
op[3] = match[3];
match += dec32table[offset];
memcpy(op + 4, match, 4);
match -= dec64;
} else {
LZ4_copy8(op, match);
match += 8;
}
op += 8;
if (unlikely(cpy > oend - 12)) {
BYTE * const oCopyLimit = oend - (WILDCOPYLENGTH - 1);
if (cpy > oend - LASTLITERALS) {
/*
* Error : last LASTLITERALS bytes
* must be literals (uncompressed)
*/
goto _output_error;
}
if (op < oCopyLimit) {
LZ4_wildCopy(op, match, oCopyLimit);
match += oCopyLimit - op;
op = oCopyLimit;
}
while (op < cpy)
*op++ = *match++;
} else {
LZ4_copy8(op, match);
if (length > 16)
LZ4_wildCopy(op + 8, match + 8, cpy);
}
op = cpy; /* correction */
}
/* end of decoding */
if (endOnInput) {
/* Nb of output bytes decoded */
return (int) (((char *)op) - dest);
} else {
/* Nb of input bytes read */
return (int) (((const char *)ip) - source);
}
/* Overflow error detected */
_output_error:
return -1;
}
int LZ4_decompress_safe(const char *source, char *dest,
int compressedSize, int maxDecompressedSize)
{
return LZ4_decompress_generic(source, dest, compressedSize,
maxDecompressedSize, endOnInputSize, full, 0,
noDict, (BYTE *)dest, NULL, 0);
}
int LZ4_decompress_safe_partial(const char *source, char *dest,
int compressedSize, int targetOutputSize, int maxDecompressedSize)
{
return LZ4_decompress_generic(source, dest, compressedSize,
maxDecompressedSize, endOnInputSize, partial,
targetOutputSize, noDict, (BYTE *)dest, NULL, 0);
}
int LZ4_decompress_fast(const char *source, char *dest, int originalSize)
{
return LZ4_decompress_generic(source, dest, 0, originalSize,
endOnOutputSize, full, 0, withPrefix64k,
(BYTE *)(dest - 64 * KB), NULL, 64 * KB);
}
int LZ4_setStreamDecode(LZ4_streamDecode_t *LZ4_streamDecode,
const char *dictionary, int dictSize)
{
LZ4_streamDecode_t_internal *lz4sd = (LZ4_streamDecode_t_internal *) LZ4_streamDecode;
lz4sd->prefixSize = (size_t) dictSize;
lz4sd->prefixEnd = (const BYTE *) dictionary + dictSize;
lz4sd->externalDict = NULL;
lz4sd->extDictSize = 0;
return 1;
}
/*
* *_continue() :
* These decoding functions allow decompression of multiple blocks
* in "streaming" mode.
* Previously decoded blocks must still be available at the memory
* position where they were decoded.
* If it's not possible, save the relevant part of
* decoded data into a safe buffer,
* and indicate where it stands using LZ4_setStreamDecode()
*/
int LZ4_decompress_safe_continue(LZ4_streamDecode_t *LZ4_streamDecode,
const char *source, char *dest, int compressedSize, int maxOutputSize)
{
LZ4_streamDecode_t_internal *lz4sd = &LZ4_streamDecode->internal_donotuse;
int result;
if (lz4sd->prefixEnd == (BYTE *)dest) {
result = LZ4_decompress_generic(source, dest,
compressedSize,
maxOutputSize,
endOnInputSize, full, 0,
usingExtDict, lz4sd->prefixEnd - lz4sd->prefixSize,
lz4sd->externalDict,
lz4sd->extDictSize);
if (result <= 0)
return result;
lz4sd->prefixSize += result;
lz4sd->prefixEnd += result;
} else {
lz4sd->extDictSize = lz4sd->prefixSize;
lz4sd->externalDict = lz4sd->prefixEnd - lz4sd->extDictSize;
result = LZ4_decompress_generic(source, dest,
compressedSize, maxOutputSize,
endOnInputSize, full, 0,
usingExtDict, (BYTE *)dest,
lz4sd->externalDict, lz4sd->extDictSize);
if (result <= 0)
return result;
lz4sd->prefixSize = result;
lz4sd->prefixEnd = (BYTE *)dest + result;
}
return result;
}
int LZ4_decompress_fast_continue(LZ4_streamDecode_t *LZ4_streamDecode,
const char *source, char *dest, int originalSize)
{
LZ4_streamDecode_t_internal *lz4sd = &LZ4_streamDecode->internal_donotuse;
int result;
if (lz4sd->prefixEnd == (BYTE *)dest) {
result = LZ4_decompress_generic(source, dest, 0, originalSize,
endOnOutputSize, full, 0,
usingExtDict,
lz4sd->prefixEnd - lz4sd->prefixSize,
lz4sd->externalDict, lz4sd->extDictSize);
if (result <= 0)
return result;
lz4sd->prefixSize += originalSize;
lz4sd->prefixEnd += originalSize;
} else {
lz4sd->extDictSize = lz4sd->prefixSize;
lz4sd->externalDict = lz4sd->prefixEnd - lz4sd->extDictSize;
result = LZ4_decompress_generic(source, dest, 0, originalSize,
endOnOutputSize, full, 0,
usingExtDict, (BYTE *)dest,
lz4sd->externalDict, lz4sd->extDictSize);
if (result <= 0)
return result;
lz4sd->prefixSize = originalSize;
lz4sd->prefixEnd = (BYTE *)dest + originalSize;
}
return result;
}
/*
* Advanced decoding functions :
* *_usingDict() :
* These decoding functions work the same as the "_continue" ones,
* except that the dictionary must be passed explicitly as a parameter.
*/
static FORCE_INLINE int LZ4_decompress_usingDict_generic(const char *source,
char *dest, int compressedSize, int maxOutputSize, int safe,
const char *dictStart, int dictSize)
{
if (dictSize == 0)
return LZ4_decompress_generic(source, dest,
compressedSize, maxOutputSize, safe, full, 0,
noDict, (BYTE *)dest, NULL, 0);
if (dictStart + dictSize == dest) {
if (dictSize >= (int)(64 * KB - 1))
return LZ4_decompress_generic(source, dest,
compressedSize, maxOutputSize, safe, full, 0,
withPrefix64k, (BYTE *)dest - 64 * KB, NULL, 0);
return LZ4_decompress_generic(source, dest, compressedSize,
maxOutputSize, safe, full, 0, noDict,
(BYTE *)dest - dictSize, NULL, 0);
}
return LZ4_decompress_generic(source, dest, compressedSize,
maxOutputSize, safe, full, 0, usingExtDict,
(BYTE *)dest, (const BYTE *)dictStart, dictSize);
}
int LZ4_decompress_safe_usingDict(const char *source, char *dest,
int compressedSize, int maxOutputSize,
const char *dictStart, int dictSize)
{
return LZ4_decompress_usingDict_generic(source, dest,
compressedSize, maxOutputSize, 1, dictStart, dictSize);
}
int LZ4_decompress_fast_usingDict(const char *source, char *dest,
int originalSize, const char *dictStart, int dictSize)
{
return LZ4_decompress_usingDict_generic(source, dest, 0,
originalSize, 0, dictStart, dictSize);
}
/*-******************************
* For backwards compatibility
********************************/
int lz4_decompress_unknownoutputsize(const unsigned char *src,
	size_t src_len, unsigned char *dest, size_t *dest_len)
{
	/* Keep the result signed: a negative value signals an error. */
	int ret = LZ4_decompress_safe((const char *)src, (char *)dest,
		(int)src_len, (int)*dest_len);

	/*
	 * The previous lz4_decompress_unknownoutputsize returned
	 * 0 on success and a negative value on error.
	 * The new LZ4_decompress_safe returns
	 * - the number of bytes written to dest on success
	 * - a negative value on error
	 * so when the result is > 0, we just return 0 here.
	 */
	if (ret > 0) {
		*dest_len = ret;
		return 0;
	}
	return -1;
}
int lz4_decompress(const unsigned char *src, size_t *src_len,
	unsigned char *dest, size_t actual_dest_len)
{
	/* Keep the result signed: a negative value signals an error. */
	int ret = LZ4_decompress_fast((const char *)src, (char *)dest,
		(int)actual_dest_len);

	/*
	 * The previous lz4_decompress returned
	 * 0 on success and a negative value on error.
	 * The new LZ4_decompress_fast returns
	 * - the number of bytes read from src on success
	 * - a negative value on error
	 * so when the result is > 0, we just return 0 here.
	 */
	if (ret > 0) {
		*src_len = ret;
		return 0;
	}
	return -1;
}
#ifndef STATIC
EXPORT_SYMBOL(LZ4_decompress_safe);
EXPORT_SYMBOL(LZ4_decompress_safe_partial);
EXPORT_SYMBOL(LZ4_decompress_fast);
EXPORT_SYMBOL(LZ4_setStreamDecode);
EXPORT_SYMBOL(LZ4_decompress_safe_continue);
EXPORT_SYMBOL(LZ4_decompress_fast_continue);
EXPORT_SYMBOL(LZ4_decompress_safe_usingDict);
EXPORT_SYMBOL(LZ4_decompress_fast_usingDict);
EXPORT_SYMBOL(lz4_decompress_unknownoutputsize);
EXPORT_SYMBOL(lz4_decompress);
MODULE_LICENSE("Dual BSD/GPL");
MODULE_DESCRIPTION("LZ4 decompressor");
#endif