Random seek in a 7z single-file archive


Question

Is it possible to do random access (a lot of seeks) into a very large file compressed with 7-Zip?

The original file is very large (999 GB of XML) and I can't store it unpacked (I don't have that much free space). So, if the 7z format allows access to a block in the middle without decompressing all the blocks before it, I can build an index of block start positions and the corresponding offsets in the original file.

The 7z archive starts with these bytes:

37 7A BC AF 27 1C 00 02 28 99 F1 9D 4A 46 D7 EA  // 7z signature; archive version 0.2; start header CRC; next header offset (low 4 bytes)
00 00 00 00 44 00 00 00 00 00 00 00 F4 56 CF 92  // next header offset (high 4 bytes); next header size = 0x44; next header CRC
00 1E 1B 48 A6 5B 0A 5A 5D DF 57 D8 58 1E E1 5F
71 BB C0 2D BD BF 5A 7C A2 B1 C7 AA B8 D0 F5 26
FD 09 33 6C 05 1E DF 71 C6 C5 BD C0 04 3A B6 29
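
For readers who want to check the annotations, here is a minimal Python sketch that decodes the first 32 bytes of the dump. It assumes the signature header layout described in the 7zFormat.txt file shipped with the LZMA SDK (6-byte signature, 2 version bytes, a 4-byte CRC, then an 8-byte offset, an 8-byte size and a 4-byte CRC, all little-endian). The function name `parse_signature_header` and the dictionary keys are made up for illustration.

```python
import struct

# First 32 bytes of the archive, copied from the hex dump in the question.
HEADER = bytes.fromhex(
    "37 7A BC AF 27 1C 00 02 28 99 F1 9D 4A 46 D7 EA"
    "00 00 00 00 44 00 00 00 00 00 00 00 F4 56 CF 92"
)

SIGNATURE = b"7z\xbc\xaf\x27\x1c"


def parse_signature_header(raw):
    """Decode the fixed 32-byte header at the start of a 7z archive.

    Assumed layout (little-endian): signature[6], major, minor,
    start header CRC (u32), next header offset (u64),
    next header size (u64), next header CRC (u32).
    """
    (signature, major, minor, start_header_crc,
     next_header_offset, next_header_size,
     next_header_crc) = struct.unpack("<6sBBIQQI", raw[:32])
    if signature != SIGNATURE:
        raise ValueError("not a 7z archive")
    return {
        "version": (major, minor),
        "start_header_crc": start_header_crc,
        # Counted from the end of this 32-byte header, per the format notes.
        "next_header_offset": next_header_offset,
        "next_header_size": next_header_size,
        "next_header_crc": next_header_crc,
    }


info = parse_signature_header(HEADER)
print(info["version"], hex(info["next_header_offset"]), info["next_header_size"])
```

The header only locates the archive's metadata block. It says nothing about where independently decodable blocks start inside the compressed stream.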

UPDATE: The 7z archiver reports that this file contains a single block of data, compressed with the LZMA algorithm. In testing, decompression runs at 600 MB/s (of unpacked data) and uses only one CPU core.

Recommended answer

It's technically possible, but if your question is "does the currently available binary 7-Zip command-line tool allow that?", the answer is unfortunately no. The best it allows is to compress each file into the archive independently, so that individual files can be retrieved directly. But since what you want to compress is a single (huge) file, this trick will not work.

I'm afraid the only way is to split your file into small blocks and feed them one at a time to an LZMA encoder (included in the LZMA SDK). Unfortunately, that requires some programming skills.
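
As a rough illustration of the chunking idea, here is a minimal Python sketch. It uses the standard-library `lzma` module rather than the LZMA SDK itself, so each block comes out as a standalone .xz stream, not as part of a 7z archive. The name `pack_blocks`, the 1 MiB block size and the index layout are assumptions made for this example.

```python
import io
import lzma

BLOCK_SIZE = 1 << 20  # 1 MiB of uncompressed data per block (arbitrary choice)


def pack_blocks(src, dst, block_size=BLOCK_SIZE):
    """Compress `src` into independent LZMA blocks written to `dst`.

    Returns an index with one entry per block:
    (uncompressed_offset, compressed_offset, compressed_size).
    """
    index = []
    uncompressed_offset = 0
    compressed_offset = 0
    while True:
        chunk = src.read(block_size)
        if not chunk:
            break
        # Each call produces a self-contained stream, so a block can later
        # be decompressed without touching the blocks before it.
        packed = lzma.compress(chunk)
        dst.write(packed)
        index.append((uncompressed_offset, compressed_offset, len(packed)))
        uncompressed_offset += len(chunk)
        compressed_offset += len(packed)
    return index


# Small in-memory demonstration; real use would pass open binary files.
source = io.BytesIO(b"<row/>" * 1000)
packed_file = io.BytesIO()
block_index = pack_blocks(source, packed_file, block_size=2048)
print(block_index)
```

Smaller blocks make seeks cheaper but hurt the compression ratio, because every block starts with an empty dictionary. The right block size depends on the data and the access pattern.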

Note: a technically inferior but trivial compression algorithm can be found here. The main program does just what you are looking for: it cuts the source file into small blocks and feeds them one by one to a compressor (in this case, LZ4). The decoder then does the reverse operation. It can easily skip all the preceding compressed blocks and go straight to the one you want to retrieve. http://code.google.com/p/lz4/source/browse/trunk/lz4demo.c
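
The skipping step can be sketched in Python as follows. This is not the code from lz4demo.c: it assumes blocks compressed independently with the standard-library `lzma` module, plus a sorted index of (uncompressed_offset, compressed_offset, compressed_size) tuples. The name `read_range` and the index layout are hypothetical.

```python
import bisect
import io
import lzma


def read_range(archive, index, offset, size):
    """Return up to `size` uncompressed bytes starting at `offset`.

    `archive` is a seekable binary file of concatenated, independently
    compressed blocks. `index` is a sorted list of
    (uncompressed_offset, compressed_offset, compressed_size) tuples.
    Assumes offset >= 0.
    """
    starts = [entry[0] for entry in index]
    # Last block whose uncompressed start is <= offset.
    i = max(bisect.bisect_right(starts, offset) - 1, 0)
    out = bytearray()
    while i < len(index) and len(out) < size:
        block_start, compressed_offset, compressed_size = index[i]
        archive.seek(compressed_offset)  # jump over all earlier blocks
        block = lzma.decompress(archive.read(compressed_size))
        skip = max(0, offset - block_start)  # nonzero only in the first block
        out += block[skip:skip + size - len(out)]
        i += 1
    return bytes(out)


# Build a tiny three-block archive in memory to exercise the reader.
data = bytes(range(256)) * 40  # 10240 bytes
archive = io.BytesIO()
index = []
for start in range(0, len(data), 4096):
    packed = lzma.compress(data[start:start + 4096])
    index.append((start, archive.tell(), len(packed)))
    archive.write(packed)

print(read_range(archive, index, 4090, 12) == data[4090:4102])
```

A read decompresses only the blocks that overlap the requested range, so its cost depends on the block size rather than on the position within the file.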
