Read the data from a TXT file inside a Zip file without extracting the contents in Matlab


Question

I have tab-delimited ASCII data in txt files that are zip-compressed (the zip may or may not contain other files). I would like to read this data into a matrix without uncompressing the zip files.

There were a few similar @matlab / @java posts earlier:

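Read the data of a CSV file inside a Zip file without extracting the contents in Matlab

Extracting a specific file from a zip in Matlab

Read content from files which are inside a Zip file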

Thanks to the above, I have gotten this far: I can identify the .txt inside the zip, but I don't know how to actually read its contents. First example:

zipFilename = 'example.zip';
zipJavaFile = java.io.File(zipFilename);
zipFile=org.apache.tools.zip.ZipFile(zipJavaFile);
entries=zipFile.getEntries;
cnt=1;
while entries.hasMoreElements
    tempObj=entries.nextElement;
    file{cnt,1}=tempObj.getName.toCharArray';
    cnt=cnt+1;
end
ind=regexp(file,'\.txt$');  % match entry names ending in .txt
ind=find(~cellfun(@isempty,ind));
file=file(ind);
file = cellfun(@(x) fullfile('.',x),file,'UniformOutput',false);
% Now Operate Any thing on File.
zipFile.close

However, I found no example of how to "operate anything on file". I can extract the path within the zip file, but I don't know how to actually read the contents of this txt file. (I wish to read its contents directly into memory, as a matrix, without extraction if possible.)

Another example is:

zipFilename = 'example.zip';
zipFile = org.apache.tools.zip.ZipFile(zipFilename);
entries = zipFile.getEntries;
while entries.hasMoreElements
    entry = entries.nextElement;
    entryName = char(entry.getName);
    [~,~,ext] = fileparts(entryName);
    if strcmp(ext,'.txt')
        inputStream  = zipFile.getInputStream(entry);
        %Read the contents of the file
        inputStream.close;
    end
end
zipFile.close

The original example contained code to extract the file, but I merely want to read it directly into memory. Again, I don't know exactly how to work with this inputStream.

Could anyone suggest an MWE?

Recommended answer

It might be a little late, but maybe someone can use it (the code was tested in Matlab R2018a):

zipFilename = 'example.zip';
zipFile = org.apache.tools.zip.ZipFile(zipFilename);
entries = zipFile.getEntries;
while entries.hasMoreElements
    entry = entries.nextElement;
    entryName = char(entry.getName);
    [~,~,ext] = fileparts(entryName);
    if strcmp(ext,'.txt')
        inputStream  = zipFile.getInputStream(entry);
        %Read the contents of the file

        buffer = java.io.ByteArrayOutputStream();
        org.apache.commons.io.IOUtils.copy(inputStream, buffer);
        data = char(typecast(buffer.toByteArray(), 'uint8')');

        inputStream.close;
    end
end
zipFile.close
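
The code above leaves the file contents in `data` as a char vector, while the question asks for a matrix. A minimal sketch of that last step follows. It assumes the text is purely numeric, tab-delimited, has no header line, and has the same number of columns in every row. The sample string is a stand-in for the real `data`, and `nCols`, `values`, and `M` are names chosen here for illustration.

```matlab
% Stand-in for the `data` char vector read from the zip entry
% (assumption: numeric, tab-delimited, no header, equal-length rows).
data = sprintf('1\t2\t3\n4\t5\t6\n');

% Split into lines, tolerating both \n and \r\n line endings.
lines = regexp(data, '\r?\n', 'split');

% Count the columns by splitting the first line on tabs.
nCols = numel(regexp(lines{1}, '\t', 'split'));

% sscanf skips all whitespace (tabs and newlines) and returns the
% numbers as one column vector, in file order.
values = sscanf(data, '%f');

% Values arrive row by row, so fill an nCols-by-nRows array
% column-wise and transpose it to get one matrix row per text line.
M = reshape(values, nCols, []).';
```

If the file has a header line or non-numeric columns, `sscanf` stops at the first token it cannot parse, so a call such as `textscan` with an explicit format would likely be a better fit in that case.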
