Why is reshape so fast? (Spoiler: Copy-on-Write)


Question

I have a big matrix A that holds 1 GB of double values. When I reshape it to different dimensions, it is incredibly fast.

A=rand(128,1024,1024);
tic;B=reshape(A,1024,128,1024);toc

Elapsed time is 0.000011 seconds.

How can it be that fast? Another observation: after running that code and storing two matrices of 1 GB each, MATLAB uses less memory than it should: Memory used by MATLAB: 1878 MB (1.969e+09 bytes)

Recommended answer

Explanation of the good performance

MATLAB uses copy-on-write whenever possible. If you write an expression like B=A, MATLAB does not copy A. Instead, both variables A and B are references to the same data structure. Only when one of the two variables is modified does MATLAB create a copy.
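
The copy-on-write behaviour described above can be observed with a small timing sketch. The matrix size and the variable names (tAssign, tWrite, a1) are arbitrary choices for illustration, and the measured times depend on machine and MATLAB version:

```matlab
% Copy-on-write sketch: the assignment is cheap, the first write pays for the copy.
A  = rand(2000, 2000);          % about 32 MB of doubles
a1 = A(1);                      % remember one value of A for the check below

tic; B = A;    tAssign = toc;   % B only references the data of A
tic; B(1) = 0; tWrite  = toc;   % first modification of B forces the real copy

fprintf('assignment: %.6f s, first write: %.6f s\n', tAssign, tWrite);

% A is untouched by the write to B.
aUnchanged = (A(1) == a1);
```

On most systems the first write should take noticeably longer than the assignment, because that is the point at which the data is actually duplicated.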

Now to the special case of reshape. Here it looks like A and B are not the same, but in memory they are. The underlying array that holds the data is unaffected by the reshape operation and nothing has to be moved: all(A(:)==B(:)) is true. All MATLAB has to do when calling reshape is create a new reference to the input data and annotate it with the new dimensions. The runtime of reshape is less than 1µs, roughly the time that two simple assignments like B=A require. For all practical purposes it is a zero-time operation.

>> tic;for i=1:1000;B=reshape(A,1024,128,1024);end;toc
Elapsed time is 0.000724 seconds.
>> tic;for i=1:1000;B=A;end;toc
Elapsed time is 0.000307 seconds.
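
That reshape leaves the stored data alone can be checked on a small array. This is only a sketch with made-up sizes and an arbitrary linear index k. It shows that the same linear (column-major) index addresses the same value before and after reshape, and that only the subscripts belonging to that index change:

```matlab
A = rand(4, 6, 2);
B = reshape(A, 6, 4, 2);

% The element order in memory is identical.
sameOrder = isequal(A(:), B(:));

% One linear index, two different sets of subscripts.
k = 17;
sameElement = (A(k) == B(k));
[r1, c1, p1] = ind2sub(size(A), k);   % subscripts of element k in A: (1,5,1)
[r2, c2, p2] = ind2sub(size(B), k);   % subscripts of element k in B: (5,3,1)
```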

It is unknown how large such a reference really is, but we can assume it takes only a few bytes.

Other zero-cost operations

Functions known to have practically zero cost (both runtime and memory):

  • B=reshape(A,sz)
  • B=A(:)
  • B=A.' - only for vectors
  • B=A' - only for vectors of real numbers, without the attribute complex. Use .' instead.
  • B=permute(A,p) - only for the cases where all(A(:)==B(:)). 1
  • B=ipermute(A,p) - only for the cases where all(A(:)==B(:)). 1
  • B=squeeze(A) 1
  • shiftdim - only for the cases where all(A(:)==B(:)), which are: 1
    • used to remove leading singleton dimensions
    • used with a negative second input
    • used without a second input argument
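
The condition all(A(:)==B(:)) in the list above can be tested directly. The following sketch uses made-up sizes. It shows permute, squeeze and shiftdim calls that only move a singleton dimension, so the element order is kept, next to a permute that is a real transpose and therefore reorders the data:

```matlab
A = rand(1, 5, 4);                    % leading singleton dimension

B1 = permute(A, [2 3 1]);             % singleton moved to the end, result is 5x4
B2 = squeeze(A);                      % singleton removed, result is 5x4
B3 = shiftdim(A);                     % leading singleton removed, result is 5x4

keepsOrder = isequal(A(:), B1(:)) && ...
             isequal(A(:), B2(:)) && ...
             isequal(A(:), B3(:));

M = reshape(1:12, 3, 4);
T = permute(M, [2 1]);                % transposes M, elements must be moved
reorders = ~isequal(M(:), T(:));
```

Only the first group falls under the zero-cost cases. The transposing permute has to rearrange the data and costs time proportional to the input size.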

Functions which are "expensive", even though they don't change the representation in memory (all(A(:)==B(:)) is true):

  • Left-sided indexing: B(1:numel(A))=A; 2
  • Right-sided indexing other than (:), including B=A(1:end); and B=A(:,:,:); 2

1 Significantly slower than reshape, with a runtime between 1µs and 1ms, probably because of some constant computation overhead. Memory consumption is practically zero and the runtime is independent of the input size. Operations without this annotation have a runtime below 1µs, roughly equivalent to reshape.

2 Zero cost in Octave

This post was originally written using MATLAB 2013b. The numbers were confirmed with MATLAB 2019b.
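
The difference between the zero-cost operations and the "expensive" indexing forms listed above can be measured with a sketch like the following. It assumes MATLAB's timeit function is available. The array size and variable names are arbitrary, and the resulting numbers will differ between machines and releases:

```matlab
A = rand(256, 256, 16);                      % about 8 MB of doubles

tColon   = timeit(@() A(:));                 % listed as practically free
tReshape = timeit(@() reshape(A, [], 1));    % listed as practically free
tIndex   = timeit(@() A(1:end));             % listed as "expensive" in MATLAB
tAll     = timeit(@() A(:,:,:));             % listed as "expensive" in MATLAB

fprintf('A(:)      %.2e s\nreshape   %.2e s\nA(1:end)  %.2e s\nA(:,:,:)  %.2e s\n', ...
        tColon, tReshape, tIndex, tAll);

% All four forms return the same values in the same order.
sameValues = isequal(A(:), reshape(A(1:end), [], 1)) && isequal(A, A(:,:,:));
```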
