How to uncompress a specific file from a TAR using Apache Commons?

Question

I'm using the Apache Commons 1.4.1 library to uncompress ".tar" files.

Problem: I don't need to extract all the files. I need to extract specific files from a specific location inside the tar archive. I only need a few .xml files, whereas the TAR file is around 300 MB, so uncompressing the entire content is a waste of resources.

I am stuck and unsure whether I have to do a nested directory comparison or whether there is a way around it.

Note: the location of the .xml files (the required files) is always the same.

The structure of the TAR is:

directory:E:\Root\data
     file:E:\Root\datasheet.txt
directory:E:\Root\map
     file:E:\Root\mapers.txt
directory:E:\Root\ui
     file:E:\Root\ui\capital.txt
     file:E:\Root\ui\info.txt
directory:E:\Root\ui\sales
     file:E:\Root\ui\sales\Reqest_01.xml
     file:E:\Root\ui\sales\Reqest_02.xml
     file:E:\Root\ui\sales\Reqest_03.xml
     file:E:\Root\ui\sales\Reqest_04.xml
directory:E:\Root\ui\sales\stores
directory:E:\Root\ui\stores
directory:E:\Root\urls
directory:E:\Root\urls\fullfilment
     file:E:\Root\urls\fullfilment\Cams_01.xml
     file:E:\Root\urls\fullfilment\Cams_02.xml
     file:E:\Root\urls\fullfilment\Cams_03.xml
     file:E:\Root\urls\fullfilment\Cams_04.xml
directory:E:\Root\urls\fullfilment\profile
directory:E:\Root\urls\fullfilment\registration
     file:E:\Root\urls\options.txt
directory:E:\Root\urls\profile

Constraint: I can't use JDK 7 and have to stick with the Apache Commons library.

My current solution:

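import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
// TarEntry and TarInputStream: the post does not show which package these come from.
// Apache Ant's org.apache.tools.tar has classes with these names; in Commons Compress
// the counterparts are TarArchiveEntry and TarArchiveInputStream.

public static void untar(File[] files) throws Exception {
    String path = files[0].toString();
    File tarPath = new File(path);
    // Directory that holds the archive; entries are extracted next to it.
    // lastIndexOf avoids a false match when a parent directory contains the file name.
    String pathWithoutName = path.substring(0, path.lastIndexOf(tarPath.getName()));
    System.out.println("tarpath:" + tarPath.getName());
    System.out.println("pathname:" + pathWithoutName);
    TarEntry entry;
    TarInputStream inputStream = null;
    try {
        inputStream = new TarInputStream(new FileInputStream(tarPath));
        byte[] buffer = new byte[1024];
        // Note: this loop still extracts every entry in the archive.
        while (null != (entry = inputStream.getNextEntry())) {
            System.out.println("Entry:" + entry.getName());
            File target = new File(pathWithoutName + entry.getName());
            if (entry.isDirectory()) {
                target.mkdirs(); // also creates missing parent directories
                continue;
            }
            if (target.getParentFile() != null) {
                target.getParentFile().mkdirs();
            }
            // Open and close one output stream per entry so no file handle leaks.
            FileOutputStream outputStream = new FileOutputStream(target);
            try {
                int bytesRead;
                while ((bytesRead = inputStream.read(buffer, 0, buffer.length)) > -1) {
                    outputStream.write(buffer, 0, bytesRead);
                }
            } finally {
                outputStream.close();
            }
            System.out.println("Extracted " + entry.getName());
        }
    } finally {
        if (inputStream != null) {
            inputStream.close();
        }
    }
}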

Recommended answer

The TAR file format is designed to be written or read as a stream (i.e., to or from a tape drive), and it does not have a centralized header. So no, there's no way around reading through the file to extract individual entries. What you can avoid is writing the unwanted entries to disk: only the entries whose names match need to be copied out.
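To make the sequential layout concrete, here is a minimal JDK-only sketch (written without JDK 7 syntax) that walks a plain, uncompressed TAR stream one 512-byte header at a time and keeps only the entries whose names match. It is an illustration under stated assumptions, not a replacement for a real TAR library: the class name `TarScanSketch` and the method `extractMatching` are hypothetical, checksums are not verified, and long-name extensions and very large entries are not handled. It also assumes entry names are stored with forward slashes (for example `Root/ui/sales/`), which the question does not confirm.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative only: scans a plain TAR stream header by header. */
class TarScanSketch {
    private static final int BLOCK = 512; // TAR headers and data are padded to 512 bytes

    /** Returns the content of each regular-file entry whose name has the given prefix and suffix. */
    static Map<String, byte[]> extractMatching(InputStream in, String prefix, String suffix)
            throws IOException {
        Map<String, byte[]> found = new LinkedHashMap<String, byte[]>();
        DataInputStream data = new DataInputStream(in);
        byte[] header = new byte[BLOCK];
        while (true) {
            try {
                data.readFully(header);
            } catch (EOFException e) {
                break; // stream ended without the usual zero-block trailer
            }
            if (isAllZero(header)) {
                break; // an all-zero block marks the end of the archive
            }
            String name = field(header, 0, 100);             // entry name
            String sizeText = field(header, 124, 12).trim(); // size in octal text
            long size = sizeText.length() == 0 ? 0 : Long.parseLong(sizeText, 8);
            byte type = header[156];                         // '0' or NUL means regular file
            long padded = (size + BLOCK - 1) / BLOCK * BLOCK;
            boolean regular = type == '0' || type == 0;
            if (regular && name.startsWith(prefix) && name.endsWith(suffix)) {
                byte[] content = new byte[(int) size]; // assumes the entry fits in memory
                data.readFully(content);
                found.put(name, content);
                skipFully(data, padded - size);
            } else {
                // There is no index to jump by: the only option is to skip this entry's data.
                skipFully(data, padded);
            }
        }
        return found;
    }

    /** Reads a NUL-terminated ASCII field from the header. */
    private static String field(byte[] header, int offset, int length) throws IOException {
        int end = offset;
        while (end < offset + length && header[end] != 0) {
            end++;
        }
        return new String(header, offset, end - offset, "US-ASCII");
    }

    private static boolean isAllZero(byte[] block) {
        for (int i = 0; i < block.length; i++) {
            if (block[i] != 0) {
                return false;
            }
        }
        return true;
    }

    /** InputStream.skip may skip fewer bytes than asked, so loop until done. */
    private static void skipFully(InputStream in, long count) throws IOException {
        while (count > 0) {
            long skipped = in.skip(count);
            if (skipped <= 0) {
                if (in.read() < 0) {
                    throw new EOFException("truncated archive");
                }
                skipped = 1;
            }
            count -= skipped;
        }
    }
}
```

A call such as `TarScanSketch.extractMatching(new FileInputStream(tarFile), "Root/ui/sales/", ".xml")` would return only the matching files, while every other entry is skipped rather than written out. With a TAR library (for example `TarArchiveInputStream` in Apache Commons Compress) the header parsing is done for you, but the loop has the same shape: fetch the next entry, check its name, and copy the data only when it matches.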

If you want random access, you should use the ZIP format and open the archive with the JDK's ZipFile. Assuming that you have enough virtual memory, the file will be memory-mapped, making random access very fast (I haven't checked whether it falls back to a random-access file if it is unable to memory-map).
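As a rough sketch of the ZIP alternative, the following uses only `java.util.zip` from the JDK and avoids JDK 7 syntax. The class name `ZipLookupSketch`, the method `readEntry`, and the example entry name are hypothetical and chosen for illustration; the sketch assumes the archive has been repackaged as a ZIP and that the entry is small enough to hold in memory.

```java
import java.io.ByteArrayOutputStream;
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

/** Illustrative only: looks up a single entry in a ZIP archive by name. */
class ZipLookupSketch {

    /** Returns the bytes of the named entry, or null if it is absent or a directory. */
    static byte[] readEntry(File zip, String entryName) throws IOException {
        ZipFile zipFile = new ZipFile(zip);
        try {
            // getEntry looks the name up directly; the other entries are not read.
            ZipEntry entry = zipFile.getEntry(entryName);
            if (entry == null || entry.isDirectory()) {
                return null;
            }
            InputStream in = zipFile.getInputStream(entry);
            try {
                ByteArrayOutputStream out = new ByteArrayOutputStream();
                byte[] buffer = new byte[8192];
                int n;
                while ((n = in.read(buffer)) != -1) {
                    out.write(buffer, 0, n);
                }
                return out.toByteArray();
            } finally {
                in.close();
            }
        } finally {
            zipFile.close();
        }
    }
}
```

For example, `ZipLookupSketch.readEntry(new File("archive.zip"), "Root/ui/sales/Reqest_01.xml")` would return that one file's bytes. ZIP entry names use forward slashes, so the Windows-style paths in the listing above would need to be converted.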
