LZ4 compressed text is larger than uncompressed
Problem description
I have read that the LZ4 algorithm is very fast and compresses reasonably well. But in my test app the compressed text is larger than the source text. What is the problem?
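// Requires <cstdlib>, <ctime>, <iostream>, <string>, and <lz4.h>.
srand(time(NULL));
std::string text;
for (int i = 0; i < 65535; ++i)
    text.push_back((char)(rand() % 256));  // uniformly random byte values
std::cout << "Text size: " << text.size() << std::endl;
char *compressedData = new char[text.size() * 2];  // well above LZ4's worst-case output size
int compressedSize = LZ4_compress(text.c_str(), text.size(), compressedData);
std::cout << "Compressed size: " << compressedSize << std::endl;
delete[] compressedData;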
I also tried LZ4_compress, but the result is the same. However, if I generate a string from a single repeated symbol, or from only two different symbols, the output does compress.
Recommended answer
Have a look at a description of the LZ4 algorithm. It encodes repeated substrings as references to their earlier occurrences, using the data it has already processed as its dictionary.
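To make the dictionary idea concrete, here is a toy sketch of an LZ77-style scan. It is not LZ4's actual implementation: the function name `countMatchedBytes`, the 4-byte minimum match default, and the hash-map bookkeeping are all assumptions chosen for illustration. It only counts how many input bytes could be expressed as back-references to data seen earlier.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Toy LZ77-style scan (illustrative only, not LZ4's real code): returns how
// many input bytes could be encoded as back-references to earlier data.
std::size_t countMatchedBytes(const std::string& data, std::size_t minMatch = 4) {
    // Maps each minMatch-byte sequence to the latest position where it started.
    std::unordered_map<std::string, std::size_t> lastSeen;
    std::size_t matched = 0;
    std::size_t i = 0;
    while (i + minMatch <= data.size()) {
        const std::string key = data.substr(i, minMatch);
        auto it = lastSeen.find(key);
        if (it != lastSeen.end()) {
            // Extend the match while the earlier occurrence keeps agreeing.
            const std::size_t prev = it->second;
            std::size_t len = minMatch;
            while (i + len < data.size() && data[prev + len] == data[i + len]) ++len;
            it->second = i;   // remember the most recent occurrence
            matched += len;
            i += len;         // skip past the bytes covered by the match
        } else {
            lastSeen.emplace(key, i);
            ++i;              // no earlier occurrence: this byte stays a literal
        }
    }
    return matched;
}
```

Calling it on a string of one repeated symbol reports almost every byte as matched, while a string with no repeated 4-byte sequence reports zero, which mirrors the behaviour described in the question.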
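Random text, or any other data without repeated sequences, will not compress well with it. LZ4 then has to store nearly everything as literals plus a small bookkeeping overhead, which is why the output ends up slightly larger than the input. For text like that, a bit-level (entropy) compression algorithm will probably do better, provided some byte values occur more often than others; uniformly random bytes, as generated in the question, have no redundancy for any lossless compressor to remove.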
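One rough way to check whether a bit-level coder has anything to gain is to measure the Shannon entropy of the byte values. The sketch below is an assumption-laden illustration: `byteEntropyBits` is a hypothetical helper, and it only looks at individual byte frequencies, ignoring any longer patterns. A result close to 8 bits per byte means a simple per-byte entropy coder cannot shrink the data; a much lower value means it can.

```cpp
#include <array>
#include <cmath>
#include <cstddef>
#include <string>

// Shannon entropy of the byte-value distribution, in bits per byte (0 to 8).
// Hypothetical helper for illustration; considers single-byte frequencies only.
double byteEntropyBits(const std::string& data) {
    if (data.empty()) return 0.0;
    std::array<std::size_t, 256> counts{};           // zero-initialised histogram
    for (unsigned char c : data) ++counts[c];
    double entropy = 0.0;
    for (std::size_t n : counts) {
        if (n == 0) continue;                        // unused byte values contribute nothing
        const double p = static_cast<double>(n) / data.size();
        entropy -= p * std::log2(p);
    }
    return entropy;
}
```

For the 65535 uniformly random bytes in the question the value should come out just under 8, whereas a string built from two equally frequent symbols gives 1 and a single repeated symbol gives 0.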