Issue decoding an LZMA-compressed zip file in Java using Apache Commons Compress / org.tukaani.xz

Problem description

I get org.tukaani.xz.UnsupportedOptionsException: Uncompressed size is too big when trying to decode an LZMA-compressed xls file inside a zip archive. Zip archives that do not use LZMA unpack and decode without any issue. In both cases the same xls file was compressed.

I am using Apache Commons Compress and org.tukaani.xz.

Sample code for reference

package com.concept.utilities.zip;

import java.io.File;
import java.io.IOException;
import java.io.InputStream;

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipFile;
import org.apache.commons.compress.compressors.lzma.LZMACompressorInputStream;

public class ApacheComm {

    public void extractLZMAZip(File zipFile, String compressFileName, String destFolder) {

        ZipFile zip = null;
        try {

            zip = new ZipFile(zipFile);
            ZipArchiveEntry zipArchiveEntry = zip.getEntry(compressFileName);
            if (null != zipArchiveEntry) {
                String name = zipArchiveEntry.getName();

                // InputStream is = zip.getInputStream(zipArchiveEntry);
                InputStream israw = zip.getRawInputStream(zipArchiveEntry);

                LZMACompressorInputStream lzma = new LZMACompressorInputStream(israw);
            }

        } catch (IOException e) {
            e.printStackTrace();
        } finally {
            if (null != zip)
                ZipFile.closeQuietly(zip);
        }
    }

    public static void main(String[] args) throws IOException {

        ApacheComm c = new ApacheComm();
        try {
            c.extractLZMAZip(new File("H:\\archives\\rollLZMA.zip"), "roll.xls", "H:\\archives\\");
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

}

Error

org.tukaani.xz.UnsupportedOptionsException: Uncompressed size is too big
    at org.tukaani.xz.LZMAInputStream.initialize(Unknown Source)
    at org.tukaani.xz.LZMAInputStream.<init>(Unknown Source)
    at org.apache.commons.compress.compressors.lzma.LZMACompressorInputStream.<init>(LZMACompressorInputStream.java:50)
    at com.concept.utilities.zip.ApacheComm.extractLZMAZip(ApacheComm.java:209)
    at com.concept.utilities.zip.ApacheComm.main(ApacheComm.java:224)

Am I missing something? Is there any other way to decode a zip file with compression method = LZMA?

Recommended answer

The reason your code isn't working is that Zip LZMA compressed data segments have a different header from standalone LZMA files, which is the format LZMACompressorInputStream expects.
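
To make the mismatch concrete: a standalone .lzma file starts with a 13-byte header (1 properties byte, a 4-byte little-endian dictionary size, and an 8-byte little-endian uncompressed size). The sketch here uses only the JDK and a made-up sample of segment bytes to show what a reader expecting that layout would take as the uncompressed size when handed the start of a Zip LZMA data segment. The class name, method names, and byte values are illustrative assumptions and are not taken from either library.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch: how a reader expecting the standalone .lzma layout would misinterpret
// the start of a Zip LZMA data segment. All names and byte values are made up.
final class LzmaAloneMisread {

    // Reads the 8-byte "uncompressed size" field that a standalone .lzma header
    // keeps at offset 5 (after 1 properties byte and a 4-byte dictionary size).
    static long uncompressedSizeAsLzmaAlone(byte[] segmentStart) {
        return ByteBuffer.wrap(segmentStart)
                .order(ByteOrder.LITTLE_ENDIAN)
                .getLong(5);
    }

    // Hypothetical first 13 bytes of a Zip LZMA data segment.
    static byte[] sampleZipSegmentStart() {
        return new byte[] {
            0x09, 0x14,                     // LZMA SDK version 9.20
            0x05, 0x00,                     // properties size = 5
            0x5D,                           // properties byte
            0x00, 0x00, (byte) 0x80, 0x00,  // dictionary size = 8 MiB
            0x00, 0x21, 0x1A, (byte) 0x87   // first compressed bytes (made up)
        };
    }

    public static void main(String[] args) {
        long misread = uncompressedSizeAsLzmaAlone(sampleZipSegmentStart());
        System.out.printf("misread uncompressed size: 0x%016X (%d)%n", misread, misread);
    }
}
```

With these sample bytes the misread field combines the real dictionary size with the first compressed bytes, and its top bit is set, so it is negative as a signed 64-bit number. That is presumably the kind of value that makes LZMAInputStream report "Uncompressed size is too big".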

You can read the specification at https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT (4.4.4 general purpose bit flag, 5.8 LZMA - Method 14), but to quote the important parts:

5.8.5 [...] The LZMA Compressed Data Segment will consist of an LZMA Properties Header followed by the LZMA Compressed Data as shown:

[LZMA properties header for file 1]
[LZMA compressed data for file 1]

[...]

5.8.8 Storage fields for the property information within the LZMA Properties Header are as follows:

LZMA Version Information    2 bytes
LZMA Properties Size        2 bytes
LZMA Properties Data        variable, defined by "LZMA Properties Size"

5.8.8.1 LZMA Version Information - this field identifies which version of the LZMA SDK was used to compress a file. The first byte will store the major version number of the LZMA SDK and the second byte will store the minor number.

5.8.8.2 LZMA Properties Size - this field defines the size of the remaining property data. Typically this size SHOULD be determined by the version of the SDK. This size field is included as a convenience and to help avoid any ambiguity arising in the future due to changes in this compression algorithm.

5.8.8.3 LZMA Property Data - this variable sized field records the required values for the decompressor as defined by the LZMA SDK. The data stored in this field SHOULD be obtained using the WriteCoderProperties() in the version of the SDK defined by the "LZMA Version Information" field.
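
The field layout quoted above can be sketched in plain Java as follows. This is a minimal illustration that assumes the fields are little-endian (as elsewhere in the zip format) and that the properties size is 5 (one properties byte plus a 4-byte dictionary size). ZipLzmaHeader and its members are hypothetical names, and the sample bytes in main are made up.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Sketch of the 9-byte Zip LZMA properties header (APPNOTE 5.8.8), assuming a
// properties size of 5. Class and field names are illustrative only.
final class ZipLzmaHeader {
    final int majorVersion;  // LZMA SDK major version
    final int minorVersion;  // LZMA SDK minor version
    final int propsSize;     // size of the properties data that follows
    final byte propsByte;    // first byte of the properties data
    final long dictSize;     // unsigned 32-bit dictionary size, widened to long

    private ZipLzmaHeader(int major, int minor, int propsSize, byte propsByte, long dictSize) {
        this.majorVersion = major;
        this.minorVersion = minor;
        this.propsSize = propsSize;
        this.propsByte = propsByte;
        this.dictSize = dictSize;
    }

    static ZipLzmaHeader parse(byte[] header) {
        if (header.length < 9) {
            throw new IllegalArgumentException("need at least 9 header bytes");
        }
        ByteBuffer buf = ByteBuffer.wrap(header).order(ByteOrder.LITTLE_ENDIAN);
        int major = buf.get() & 0xff;
        int minor = buf.get() & 0xff;
        int size = buf.getShort() & 0xffff;
        if (size != 5) {
            throw new IllegalArgumentException("unexpected LZMA properties size: " + size);
        }
        byte props = buf.get();
        long dict = buf.getInt() & 0xffffffffL;
        return new ZipLzmaHeader(major, minor, size, props, dict);
    }

    public static void main(String[] args) {
        // Made-up header: SDK 9.20, properties size 5, props 0x5D, 8 MiB dictionary.
        byte[] sample = {0x09, 0x14, 0x05, 0x00, 0x5D, 0x00, 0x00, (byte) 0x80, 0x00};
        ZipLzmaHeader h = parse(sample);
        System.out.println("SDK " + h.majorVersion + "." + h.minorVersion
                + ", dictSize=" + h.dictSize);
    }
}
```

Everything after these 9 bytes is the raw LZMA stream, which is why the properties byte and dictionary size have to be handed to the decoder separately instead of being read from the stream by a standalone .lzma reader.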

Code example:

import org.apache.commons.compress.archivers.zip.ZipArchiveEntry;
import org.apache.commons.compress.archivers.zip.ZipFile;
import org.apache.commons.compress.archivers.zip.ZipMethod;
import org.apache.commons.io.IOUtils;
import org.tukaani.xz.LZMAInputStream;

import java.io.IOException;
import java.io.InputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class ApacheComm
{
    public InputStream getInputstreamForEntry(ZipFile zipFile, ZipArchiveEntry ze) throws IOException
    {
        if (zipFile.canReadEntryData(ze))
        {
            return zipFile.getInputStream(ze);
        } else if (ze.getMethod() == ZipMethod.LZMA.getCode()) {
            InputStream inputStream = zipFile.getRawInputStream(ze);
            ByteBuffer buffer = ByteBuffer.wrap(IOUtils.readFully(inputStream, 9))
                    .order(ByteOrder.LITTLE_ENDIAN);

            // Lzma sdk version used to compress this data
            int majorVersion = buffer.get();
            int minorVersion = buffer.get();

            // Byte count of the properties data that follows, stored as an unsigned short.
            // Should be 5 (propByte + dictSize) in all versions
            int size = buffer.getShort() & 0xffff;
            if (size != 5)
                throw new UnsupportedOperationException("Unexpected LZMA properties size: " + size);

            byte propByte = buffer.get();

            // Dictionary size is an unsigned 32-bit little endian integer.
            int dictSize = buffer.getInt();

            long uncompressedSize;
            if ((ze.getRawFlag() & (1 << 1)) != 0)
            {
                // If the entry uses an EOS marker (general purpose bit 1), pass -1
                // to indicate that the uncompressed size is unknown
                uncompressedSize = -1;
            } else {
                uncompressedSize = ze.getSize();
            }

            return new LZMAInputStream(inputStream, uncompressedSize, propByte, dictSize);
        } else {
            throw new UnsupportedOperationException();
        }
    }
}
