Memory map file in MATLAB?

Question

I have decided to use memmapfile because my data (typically 30 GB to 60 GB) is too big to fit in a computer's memory.

My data files consist of two columns of data that correspond to the outputs of two sensors, and I have them in both .bin and .txt formats.

m=memmapfile('G:\E-Stress Research\Data\2013-12-18\LD101_3\EPS/LD101_3.bin','format','int32')
m.data(1)

I used the above code to memory map my data to a variable "m", but I have no idea which data format to use ('int8', 'int16', 'int32', 'int64', 'uint8', 'uint16', 'uint32', 'uint64', 'single', or 'double'). In fact, I tried all of the listed data formats that MATLAB supports, but when I use m.data(index number) I never get the pair of numbers (2 columns of data) that I expect. The number I get also differs depending on the format I use.

If anyone has experience with memmapfile please help me.

Here are some smaller versions of my data files so people can understand how my data is structured:

Cheers, James

Recommended answer

memmapfile is designed for reading binary files, which is why you are having trouble with your text file. The data in there are characters, so you'll have to read them as characters and then parse them into numbers. More on that below.

The binary file appears to contain more than just a stream of floating point values written in binary format. I see identifiers (strings) and other things in the file as well. Your only hope of reading that is to contact the manufacturer of the device that created the binary file and ask them about how to read in such files. There'll probably be an SDK, or at least a description of the format. You might want to look into this as the floating point numbers in your text file might be truncated, i.e., you have lost precision compared to directly reading the binary representation of the floats.
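
If the manufacturer's documentation did confirm a fixed-size header followed by interleaved sensor values of a known type, memmapfile could map such a file directly as a two-row matrix. The following is only a minimal sketch under those assumptions: the 16-byte header, the 'single' data type, and the field name x are hypothetical and are not the real layout of LD101_3.bin. It writes a small test file first so that it is self-contained.

```matlab
% ASSUMPTION: hypothetical layout (16-byte header, then interleaved
% single-precision values s1, s2, s1, s2, ...). NOT the real .bin format.
fname = [tempname '.bin'];
headerBytes = 16;                                  % hypothetical header size
vals = single([0.398516 0.063440; 0.399611 0.063284; 0.398985 0.061253]);

% Write a small test file with that layout.
fid = fopen(fname, 'w');
fwrite(fid, zeros(1, headerBytes, 'uint8'), 'uint8');  % fake header
fwrite(fid, vals.', 'single');                     % column order of vals.' interleaves the sensors
fclose(fid);

% Work out how many sensor pairs follow the header (4 bytes per single).
info = dir(fname);
nPairs = (info.bytes - headerBytes) / (2 * 4);

% Map everything after the header as a 2-by-nPairs matrix named x.
m = memmapfile(fname, 'Offset', headerBytes, ...
               'Format', {'single', [2 nPairs], 'x'}, 'Repeat', 1);

pair2 = m.Data.x(:, 2).';   % both sensor values of the second sample

clear m                     % release the mapping before deleting the file
delete(fname);
```

With a mapping like this, m.Data.x(:, k) returns the k-th pair without loading the whole file, which is what the question was hoping m.data(index) would do.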

Ok, so how to read your file with memmapfile? This post provides some hints.

So first we open your file as 'uint8' (note there is no 'char' option, so as a workaround we read the content of the file into a datatype of the same size):

m = memmapfile('RTL5_57.txt','Format','uint8'); % uint8 is default, you could leave that off

We can render the data read in as uint8 as characters by casting it to char:

c = char(m.Data(1:57)).' % read the first three lines (3*19 bytes). NB: transpose just for getting nice output, don't use it in your code
c = 
    0.398516    0.063440
    0.399611    0.063284
    0.398985    0.061253

As each line in your file has the same length (2*8 chars for the numbers, 1 tab, and 2 chars for the newline = 19 chars), we can read N lines from the file by reading N*19 values. So m.Data(1:19) gets you the first line, m.Data(20:38) the second line, and m.Data(20:57) the second and third lines. Read as much as you want at once.

Then we'll have to parse the read-in data into floating point numbers:

f = sscanf(c,'%f')
f =
    0.3985
    0.0634
    0.3996
    0.0633
    0.3990
    0.0613

All that's left now is to reshape them into your two-column format:

d = reshape(f,2,[]).'
d =
    0.3985    0.0634
    0.3996    0.0633
    0.3990    0.0613
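
Putting the steps above together, here is a minimal self-contained sketch. It assumes every line is exactly 19 bytes long (8 characters, tab, 8 characters, CR+LF), as in your sample. The temporary file and the variable names bytesPerLine, k, and n are mine, for illustration only.

```matlab
% Write a small test file in the same fixed-width layout as the sample data.
fname = [tempname '.txt'];
sample = [0.398516 0.063440; 0.399611 0.063284; 0.398985 0.061253];
fid = fopen(fname, 'w');
fprintf(fid, '%8.6f\t%8.6f\r\n', sample.');   % 8 + 1 + 8 + 2 = 19 bytes per line
fclose(fid);

bytesPerLine = 19;                             % ASSUMPTION: every line has this length
m = memmapfile(fname, 'Format', 'uint8');
nLinesTotal = numel(m.Data) / bytesPerLine;    % number of lines in the file

% Read n lines starting at line k by indexing the matching byte range.
k = 2;
n = 2;
idx = (k-1)*bytesPerLine + 1 : (k-1+n)*bytesPerLine;   % here: 20:57
c = char(m.Data(idx)).';                       % bytes -> characters (row vector)
d = reshape(sscanf(c, '%f'), 2, []).';         % parse, then reshape to n-by-2

clear m                                        % release the mapping before deleting
delete(fname);
```

Because the byte range is computed from the line number, any block of lines can be reached directly without reading the preceding ones.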

Easier ways than using memmapfile: You don't need memmapfile to solve your problem, and I think it makes things more complicated. You can simply use fopen followed by fread:

Nlines = 10000; % number of lines to read per call
fid = fopen('RTL5_57.txt');
c = fread(fid,Nlines*19,'*char');
% now sscanf and reshape as above
% NB: one can read the values from the text file directly with f = fscanf(fid,'%f',Nlines*2)
% (the size argument counts values, not characters).
% However, in testing, I have found calling fread followed by sscanf to be faster,
% which will make a significant difference when reading such large files.

Using this you can read Nlines pairs of values at a time, process them, and simply call fread again to read the next Nlines. fread remembers where it is in the file (as does fscanf), so simply use the same call to get the next lines. It's thus easy to write a loop that processes the whole file, testing with feof(fid) whether you are at the end of the file.
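
A minimal sketch of such a loop, again assuming 19-byte lines. Instead of relying on feof alone, it stops when fread returns nothing, because feof only becomes true after a read has actually hit the end of the file. The temporary file and the running column means are just illustrative processing, not part of your data.

```matlab
% Write a test file of 25 fixed-width lines (19 bytes each).
fname = [tempname '.txt'];
data = reshape((1:50) / 100, 2, []);           % 2-by-25, values 0.01 ... 0.50
fid = fopen(fname, 'w');
fprintf(fid, '%8.6f\t%8.6f\r\n', data);
fclose(fid);

Nlines = 10;                                   % lines per chunk (small for the demo)
bytesPerLine = 19;                             % ASSUMPTION: fixed line length

fid = fopen(fname, 'r');
total = zeros(1, 2);                           % running column sums
nRead = 0;                                     % lines processed so far
while true
    c = fread(fid, Nlines*bytesPerLine, '*char');
    if isempty(c)                              % nothing left to read
        break
    end
    d = reshape(sscanf(c.', '%f'), 2, []).';   % chunk as an n-by-2 matrix
    total = total + sum(d, 1);                 % "process d": accumulate sums
    nRead = nRead + size(d, 1);
end
fclose(fid);
delete(fname);

colMeans = total / nRead;                      % mean of each sensor column
```

The last chunk is simply shorter than the others (5 lines here), and reshape(..., 2, []) handles that without special casing.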

An even easier way is suggested here: use textscan. To slightly adapt their example code:

Nlines = 10000;

% describe the format of the data
% for more information, see the textscan reference page
fmt = '%f\t%f';   % named fmt to avoid shadowing MATLAB's built-in format function

fid = fopen('RTL5_57.txt');

while ~feof(fid)
   C = textscan(fid, fmt, Nlines, 'CollectOutput', true);
   d = C{1};  % immediately clear C at this point if you need the memory! 
   % process d
end

fclose(fid);

Note again, however, that fread followed by sscanf will be fastest. The fread method, though, breaks as soon as one line in the text file doesn't exactly match your format, because it relies on every line being exactly 19 characters long. textscan, on the other hand, is forgiving of whitespace changes and thus more robust.
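
To illustrate the robustness point, here is a small sketch in which the lines no longer share one fixed length, so the N*19 byte arithmetic would misalign, while textscan still returns two columns. The test file contents are made up for the demonstration.

```matlab
% Write a test file whose lines use irregular whitespace.
fname = [tempname '.txt'];
fid = fopen(fname, 'w');
fprintf(fid, '0.398516\t0.063440\r\n');        % regular line: 19 bytes
fprintf(fid, '0.399611   0.063284\r\n');       % three spaces instead of a tab
fprintf(fid, '  0.398985\t\t0.061253\r\n');    % leading blanks and two tabs
fclose(fid);

% textscan skips the extra whitespace while parsing the two %f fields.
fid = fopen(fname, 'r');
C = textscan(fid, '%f %f', 'CollectOutput', true);
fclose(fid);
delete(fname);

d = C{1};                                      % numeric matrix, one row per line
```

The price for this tolerance is the slower parsing mentioned above, so which method fits depends on how well-formed your files are.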
