MATLAB: Saving several variables to "-v7.3" (HDF5) .mat-files seems to be faster when using the "-append" flag. How come?

Problem description

Note: This question deals with an issue observed in 2011 with an old version of MATLAB (R2009a). As per the update from July 2016 below, the issue/bug in MATLAB seems to no longer exist (tested with R2016a; scroll down to the end of the question to see the update).

I am using MATLAB R2009b and I need to write a larger script that converts the contents of a larger set of .zip files to v7.3 mat files (with an underlying HDF5 data model). Reading is OK. The issue is with saving. And there is actually no problem: my file saves nicely using the save command.

My question is more in the sense: Why am I observing the following surprising (for me) behavior in MATLAB?

Let's look at my issue in general. In this current test scenario I will be generating one output: a -v7.3 mat-file. This .mat-file will contain 40 blocks as individual variables. Each variable will be named "block_NNN" (with NNN from 1 to 40) and will contain a struct with fields frames and blockNo. Field frames contains a 480x240x65 sequence of uint8 image data (here just random data generated using randi). Field blockNo contains the block number.

Remark: In the real script (which I have not yet finished) I will perform the above a total of 370 times, converting 108GB of raw data in total. That is why I care about the following.

Anyway, first I define some general variables:


% some sizes for dummy data and loops:
num_blockCount = 40;
num_blockLength = 65;
num_frameHeight = 480;
num_frameWidth = 240;

I then generate some dummy data that has shape and size identical to the actual raw data:


% generate empty struct:
stu_data2disk = struct();

% loop over blocks:
for num_k = 1:num_blockCount

   % generate block-name:
   temp_str_blockName = sprintf('block_%03u', num_k);

   % generate temp struct for current block:
   temp_stu_value = struct();
   temp_stu_value.frames = randi( ...
      [0 255], ...
      [num_frameHeight num_frameWidth num_blockLength], ...
      'uint8' ...
   );
   temp_stu_value.blockNo = num_k;

   % store block under a dynamic field name:
   stu_data2disk.(temp_str_blockName) = temp_stu_value;

end

I now have all my random test-data in a struct stu_data2disk. Now I would like to save the data using one of two possible methods.

Let's try the simple one first:


% save data (simple):
disp('Save data the simple way:')
tic;
save converted.mat -struct stu_data2disk -v7.3;
toc;

The file is written without problems (286MB). The output is:


Save data the simple way:
Elapsed time is 14.004449 seconds.

OK - then I remembered that I would like to follow the save-procedure over the 40 blocks. Thus instead of the above I loop over the blocks and append them in sequence:


% save to file, using append:
disp('Save data using -append:')
tic;
for num_k = 1:num_blockCount

   % generate block-name:
   temp_str_blockName = sprintf('block_%03u', num_k);

   temp_str_appendToggle = '';
   if (num_k > 1)
      temp_str_appendToggle = '-append';
   end

   % generate save command:
   temp_str_saveCommand = [ ...
      'save ', ...
      'converted_append.mat ', ...
      '-struct stu_data2disk ', temp_str_blockName, ' '...
      temp_str_appendToggle, ' ', ...
      '-v7.3', ...
      ';' ...
   ];

   % evaluate save command:
   eval(temp_str_saveCommand);

end
toc;
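
As an aside (separate from the timing question): the command-string assembly plus eval above is not strictly necessary, since the functional form of save accepts the same flags as ordinary string arguments. A minimal sketch of an equivalent loop, reusing the variable names from above:

```matlab
% save to file using -append, functional form of save (no eval):
for num_k = 1:num_blockCount
    temp_str_blockName = sprintf('block_%03u', num_k);
    if num_k > 1
        save('converted_append.mat', '-struct', 'stu_data2disk', ...
            temp_str_blockName, '-append', '-v7.3');
    else
        save('converted_append.mat', '-struct', 'stu_data2disk', ...
            temp_str_blockName, '-v7.3');
    end
end
```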

And again the file saves nicely (286MB). The output is:


Save data using -append:
Elapsed time is 0.956968 seconds.

Interestingly, the append method is much faster. My question is: why?

Output from dir converted*.mat:


09-02-2011  20:38       300,236,392 converted.mat
09-02-2011  20:37       300,264,316 converted_append.mat
               2 File(s)    600,500,708 bytes

The files are not identical in size, and a test with fc in Windows 7 revealed ... well, many binary differences. Perhaps the data was shifted a bit - so this tells us nothing.

Does someone have an idea what is going on here? Is the appended file using a much more optimized data structure, perhaps? Or maybe Windows has cached the file and makes access to it much faster?

I made the effort of test-reading from the two files as well. Without presenting the numbers here: the appended version was a little bit faster (which could mean something in the long run, though).
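
A sketch of such a test-read (assuming num_blockCount from the listing above is still in scope, and reading each block variable individually):

```matlab
% time a per-variable read from both files:
files = {'converted.mat', 'converted_append.mat'};
for iFile = 1:numel(files)
    tic;
    for num_k = 1:num_blockCount
        temp_stu = load(files{iFile}, sprintf('block_%03u', num_k)); %#ok<NASGU>
    end
    fprintf('%-22s %.3f sec\n', files{iFile}, toc);
end
```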

Update: I just tried using no format flag (it defaults to -v7 on my system), and there is not much difference anymore:


Save data the simple way (-v7):
Elapsed time is 13.092084 seconds.
Save data using -append (-v7):
Elapsed time is 14.345314 seconds.

Update: I corrected the above mistake. Previously I mentioned that the stats were for -v6, but I was mistaken. I had just removed the format flag and assumed the default was -v6, but actually it is -v7.

I have created new test stats for all formats on my system using Andrew's fine framework (all formats are for the same random test data, now read from file):


15:15:51.422: Testing speed, format=-v6, R2009b on PCWIN, arch=x86, os=Microsoft Windows 7 Professional  6.1.7600 N/A Build 7600
15:16:00.829: Save the simple way:            0.358 sec
15:16:01.188: Save using multiple append:     7.432 sec
15:16:08.614: Save using one big append:      1.161 sec

15:16:24.659: Testing speed, format=-v7, R2009b on PCWIN, arch=x86, os=Microsoft Windows 7 Professional  6.1.7600 N/A Build 7600
15:16:33.442: Save the simple way:           12.884 sec
15:16:46.329: Save using multiple append:    14.442 sec
15:17:00.775: Save using one big append:     13.390 sec

15:17:31.579: Testing speed, format=-v7.3, R2009b on PCWIN, arch=x86, os=Microsoft Windows 7 Professional  6.1.7600 N/A Build 7600
15:17:40.690: Save the simple way:           13.751 sec
15:17:54.434: Save using multiple append:     3.970 sec
15:17:58.412: Save using one big append:      6.138 sec

File sizes:


10-02-2011  15:16       299,528,768 converted_format-v6.mat
10-02-2011  15:16       299,528,768 converted_append_format-v6.mat
10-02-2011  15:16       299,528,832 converted_append_batch_format-v6.mat
10-02-2011  15:16       299,894,027 converted_format-v7.mat
10-02-2011  15:17       299,894,027 converted_append_format-v7.mat
10-02-2011  15:17       299,894,075 converted_append_batch_format-v7.mat
10-02-2011  15:17       300,236,392 converted_format-v7.3.mat
10-02-2011  15:17       300,264,316 converted_append_format-v7.3.mat
10-02-2011  15:18       300,101,800 converted_append_batch_format-v7.3.mat
               9 File(s)  2,698,871,005 bytes

Thus -v6 seems to be the fastest for writing. There are also no large differences in file sizes. HDF5 does have some basic deflate (compression) method built in, as far as I know.
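
(As a side note on that deflate filter: plain save gives no control over it, but it can be tried directly on one block's frames via the low-level HDF5 interface. The following is only a sketch; h5create/h5write were introduced in releases after the R2009b used here:)

```matlab
% write one block's frames to HDF5 once raw and once with the
% built-in deflate filter, then compare file sizes on disk:
frames = randi([0 255], [480 240 65], 'uint8');
h5create('frames_raw.h5', '/frames', size(frames), 'Datatype', 'uint8');
h5write('frames_raw.h5', '/frames', frames);
h5create('frames_deflate.h5', '/frames', size(frames), ...
    'Datatype', 'uint8', 'ChunkSize', [480 240 1], 'Deflate', 6);
h5write('frames_deflate.h5', '/frames', frames);
dir('frames_*.h5')
```

(With random data the deflate variant will hardly shrink, which would match the near-identical .mat sizes above.)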

Hmm, probably some optimization of the underlying HDF5 write functions?

Currently I still think that some underlying fundamental HDF5 write function is optimized for adding datasets to an HDF5 file (which is what happens when adding new variables to a -v7.3 file). I believe I have read somewhere that HDF5 should be optimized in this very way ... though I cannot be sure.
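
One way to test that hypothesis would be to open the two .mat files directly as HDF5 and compare the dataset layouts. A sketch (h5info/h5disp appeared in releases after R2009b, which only shipped the older hdf5info):

```matlab
% a -v7.3 .mat file is plain HDF5; struct variables show up as groups:
info_simple = h5info('converted.mat');
info_append = h5info('converted_append.mat');
fprintf('%s\n', info_simple.Groups.Name);      % list top-level variables
h5disp('converted_append.mat', '/block_001');  % layout of one block
```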

Some additional details worth noting:

The behavior is very systemic, as we see in Andrew's answer below. It also seems to matter quite a lot whether you run these things in the local scope of a function or in the "global" scope of an m-script. My first results were from an m-script where files were written to the current directory. I can still only reproduce the 1-second write for -v7.3 in the m-script; the function calls apparently add some overhead.

Update July 2016:

I found this again and thought I might test it with the newest MATLAB available to me at the moment. With MATLAB R2016a on Windows 7 x64 the problem seems to have been fixed:


14:04:06.277: Testing speed, imax=255, R2016a on PCWIN64, arch=AMD64, 16 GB, os=Microsoft Windows 7 Enterprise  Version 6.1 (Build 7601: Service Pack 1)
14:04:10.600: basic -v7.3:                    7.599 sec      5.261 GB used
14:04:18.229: basic -v7.3:                    7.894 sec      5.383 GB used
14:04:26.154: basic -v7.3:                    7.909 sec      5.457 GB used
14:04:34.096: basic -v7.3:                    7.919 sec      5.498 GB used
14:04:42.048: basic -v7.3:                    7.886 sec      5.516 GB used     286 MB file   7.841 sec mean
14:04:50.581: multiappend -v7.3:              7.928 sec      5.819 GB used
14:04:58.544: multiappend -v7.3:              7.905 sec      5.834 GB used
14:05:06.485: multiappend -v7.3:              8.013 sec      5.844 GB used
14:05:14.542: multiappend -v7.3:              8.591 sec      5.860 GB used
14:05:23.168: multiappend -v7.3:              8.059 sec      5.868 GB used     286 MB file   8.099 sec mean
14:05:31.913: bigappend -v7.3:                7.727 sec      5.837 GB used
14:05:39.676: bigappend -v7.3:                7.740 sec      5.879 GB used
14:05:47.453: bigappend -v7.3:                7.645 sec      5.884 GB used
14:05:55.133: bigappend -v7.3:                7.656 sec      5.877 GB used
14:06:02.824: bigappend -v7.3:                7.963 sec      5.871 GB used     286 MB file   7.746 sec mean

This was tested with Andrew Janke's reproMatfileAppendSpeedup function from the accepted answer below (5 passes with format 7.3). Now -append is equally slow as, or slower than, a single save - as it should be. Perhaps it was a problem with an early build of the HDF5 driver used in R2009a.

Accepted answer

Holy cow. I can reproduce. Tried the single-append variation too; it's even speedier. Looks like "-append" just magically makes HDF5-based save() 30x faster. I don't have an explanation but I wanted to share what I found.

I wrapped up your test code in a function, refactoring it to make the save logic agnostic about the test data structure so you can run it on other data sets, and added some more diagnostic output.

Don't see the big speedup everywhere. It's huge on my 64-bit XP box and a 32-bit Server 2003 box, big on my 64-bit Windows 7 box, nonexistent on a 32-bit XP box. (Though multiple appends are a huge loss on Server 2003.) R2010b is slower in many cases. Maybe HDF5 appends or save's use of it just rock on newer Windows builds. (XP x64 is actually the Server 2003 kernel.) Or maybe it's just a machine config difference. There's a fast RAID on the XP x64 machine, and the 32-bit XP has less RAM than the rest. What OS and architecture are you running? Can you try this repro too?

19:36:40.289: Testing speed, format=-v7.3, R2009b on PCWIN64, arch=AMD64, os=Microsoft(R) Windows(R) XP Professional x64 Edition 5.2.3790 Service Pack 2 Build 3790
19:36:55.930: Save the simple way:           11.493 sec
19:37:07.415: Save using multiple append:     1.594 sec
19:37:09.009: Save using one big append:      0.424 sec


19:39:21.681: Testing speed, format=-v7.3, R2009b on PCWIN, arch=x86, os=Microsoft Windows XP Professional 5.1.2600 Service Pack 3 Build 2600
19:39:37.493: Save the simple way:           10.881 sec
19:39:48.368: Save using multiple append:    10.187 sec
19:39:58.556: Save using one big append:     11.956 sec


19:44:33.410: Testing speed, format=-v7.3, R2009b on PCWIN64, arch=AMD64, os=Microsoft Windows 7 Professional  6.1.7600 N/A Build 7600
19:44:50.789: Save the simple way:           14.354 sec
19:45:05.156: Save using multiple append:     6.321 sec
19:45:11.474: Save using one big append:      2.143 sec


20:03:37.907: Testing speed, format=-v7.3, R2009b on PCWIN, arch=x86, os=Microsoft(R) Windows(R) Server 2003, Enterprise Edition 5.2.3790 Service Pack 2 Build 3790
20:03:58.532: Save the simple way:           19.730 sec
20:04:18.252: Save using multiple append:    77.897 sec
20:05:36.160: Save using one big append:      0.630 sec

This looks huge. If it holds up on other data sets, I might use this trick in a lot of places myself. It may be something to bring up with MathWorks, too. Could they use the fast append technique in normal saves or other OS versions, too?

Here's the self-contained repro function.

function out = reproMatfileAppendSpeedup(nPasses, tests, imax, formats)
%REPROMATFILEAPPENDSPEEDUP Show how -append makes v7.3 saves much faster
%
% Examples:
% reproMatfileAppendSpeedup()
% reproMatfileAppendSpeedup(2, [], 0, {'7.3','7','6'}); % low-entropy test

if nargin < 1 || isempty(nPasses);  nPasses = 1;  end
if nargin < 2 || isempty(tests);    tests = {'basic','multiappend','bigappend'}; end
if nargin < 3 || isempty(imax);     imax = 255; end
if nargin < 4 || isempty(formats);  formats = '7.3'; end % -v7 and -v6 do not show the speedup
tests = cellstr(tests);
formats = cellstr(formats);

fprintf('%s: Testing speed, imax=%d, R%s on %s\n',...
    timestamp, imax, version('-release'), systemDescription());

tempDir = setupTempDir();
testData = generateTestData(imax);

testMap = struct('basic','saveSimple', 'multiappend','saveMultiAppend', 'bigappend','saveBigAppend');

for iFormat = 1:numel(formats)
    format = formats{iFormat};
    formatFlag = ['-v' format];
    %fprintf('%s: Format %s\n', timestamp, formatFlag);
    for iTest = 1:numel(tests)
        testName = tests{iTest};
        saveFcn = testMap.(testName);
        te = NaN(1, nPasses);
        for iPass = 1:nPasses
            fprintf('%s: %-30s', timestamp, [testName ' ' formatFlag ':']);
            t0 = tic;
            matFile = fullfile(tempDir, sprintf('converted-%s-%s-%d.mat', testName, format, iPass));
            feval(saveFcn, matFile, testData, formatFlag);
            te(iPass) = toc(t0);
            if iPass == nPasses
                fprintf('%7.3f sec      %5.3f GB used   %5.0f MB file   %5.3f sec mean\n',...
                    te(iPass), physicalMemoryUsed/(2^30), getfield(dir(matFile),'bytes')/(2^20), mean(te));
            else
                fprintf('%7.3f sec      %5.3f GB used\n', te(iPass), physicalMemoryUsed/(2^30));
            end
        end
        % Verify data to make sure we are sane
        gotBack = load(matFile);
        gotBack = rmfield(gotBack, intersect({'dummy'}, fieldnames(gotBack)));
        if ~isequal(gotBack, testData)
            fprintf('ERROR: Loaded data differs from original for %s %s\n', formatFlag, testName);
        end
    end
end

% Clean up
rmdir(tempDir, 's');

%%
function saveSimple(file, data, formatFlag)
save(file, '-struct', 'data', formatFlag);

%%
function out = physicalMemoryUsed()
if ~ispc
    out = NaN;
    return; % memory() only works on Windows
end
[u,s] = memory();
out = s.PhysicalMemory.Total - s.PhysicalMemory.Available;

%%
function saveBigAppend(file, data, formatFlag)
dummy = 0;
save(file, 'dummy', formatFlag);
fieldNames = fieldnames(data);
save(file, '-struct', 'data', fieldNames{:}, '-append', formatFlag);

%%
function saveMultiAppend(file, data, formatFlag)
fieldNames = fieldnames(data);
for i = 1:numel(fieldNames)
    if (i > 1); appendFlag = '-append'; else; appendFlag = ''; end
    save(file, '-struct', 'data', fieldNames{i}, appendFlag, formatFlag);
end


%%
function testData = generateTestData(imax)
nBlocks = 40;
blockSize = [65 480 240];
for i = 1:nBlocks
    testData.(sprintf('block_%03u', i)) = struct('blockNo',i,...
        'frames', randi([0 imax], blockSize, 'uint8'));
end

%%
function out = timestamp()
%TIMESTAMP Showing timestamps to make sure it is not a tic/toc problem
out = datestr(now, 'HH:MM:SS.FFF');

%%
function out = systemDescription()
if ispc
    platform = [system_dependent('getos'),' ',system_dependent('getwinsys')];
elseif ismac
    [fail, input] = unix('sw_vers');
    if ~fail
        platform = strrep(input, 'ProductName:', '');
        platform = strrep(platform, sprintf('\t'), '');
        platform = strrep(platform, sprintf('\n'), ' ');
        platform = strrep(platform, 'ProductVersion:', ' Version: ');
        platform = strrep(platform, 'BuildVersion:', 'Build: ');
    else
        platform = system_dependent('getos');
    end
else
    platform = system_dependent('getos');
end
arch = getenv('PROCESSOR_ARCHITEW6432');
if isempty(arch)
    arch = getenv('PROCESSOR_ARCHITECTURE');
end
try
    [~,sysMem] = memory();
catch
    sysMem.PhysicalMemory.Total = NaN;
end
out = sprintf('%s, arch=%s, %.0f GB, os=%s',...
    computer, arch, sysMem.PhysicalMemory.Total/(2^30), platform);

%%
function out = setupTempDir()
out = fullfile(tempdir, sprintf('%s - %s', mfilename, datestr(now, 'yyyymmdd-HHMMSS-FFF')));
mkdir(out);

I modified the repro function, adding multiple iterations and parameterizing it for save styles, file formats, and imax for the randi generator.

I think filesystem caching is a big factor in the fast -append behavior. When I do a bunch of runs in a row with reproMatfileAppendSpeedup(20) and watch System Information in Process Explorer, most of them are under a second, and physical memory usage quickly ramps up by a couple of GB. Then every dozen passes, the write stalls and takes 20 or 30 seconds, and physical RAM usage slowly ramps back down to about where it started. I think this means that Windows is caching a lot of writes in RAM, and something about -append makes it more willing to do so. But the amortized time including those stalls is still a lot faster than the basic save, for me.

By the way, after doing multiple passes for a couple hours, I'm having a hard time reproducing the original timings.
