Dataset array indexing is very slow with Statistics Toolbox

Views: 93 | Published: 2020/5/6 14:49:15 | matlab


Problem description

Why is indexing into a dataset array so slow? A peek into the dataset.subsref function shows that all the columns of the dataset are stored in a cell array. However, cell indexing is much, much faster than dataset indexing, which is just indexing into a cell array under the hood. My guess is that this has to do with some overhead in MATLAB OOP. Any ideas on how to speed this up?

%% Using R2011a, PCWIN64
feature accel off;  % turn off JIT

dat = (1:1e6)';
dat2 = repmat({'abc'}, 1e6, 1);
celldat = {dat dat2};
ds = dataset(dat, dat2);
N = 1e2;

tic;
for j = 1:N
    tmp = celldat{2};
end
toc;

tic;
for j = 1:N
    tmp2 = ds.dat2; % 2.778sec spent on line 262 of dataset.subsref
end
toc;

feature accel on;  % turn JIT back on

Elapsed time is 0.000165 seconds.
Elapsed time is 2.778995 seconds.

EDIT: I've updated the example to be more like the problem I'm seeing. A huge amount of time is spent on line 262 of dataset.subsref - "b = a.data{varIndex};". This is very strange to me, since it is a simple cell dereference. I'm wondering if there is an OOP trick that will allow me to index into "a.data" without the strange overhead.

EDIT2: As per Andrew's suggestion, I've submitted this as a bug to MathWorks. Will update if I hear anything from them.

EDIT3: MathWorks responded and said they are now aware of the problem and will fix it in a future release. They noted that the problem is specific to cell arrays, and advised avoiding them where possible.
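Until a fix ships, one possible workaround is to pull the column out of the dataset once, before the loop, so that dataset.subsref runs a single time instead of N times. The following is only a sketch, assuming the loop body reads the column and never writes back to the dataset. It reuses the variable names from the example above, and `col` is a name introduced here for illustration. It still requires the Statistics Toolbox, and the timings have not been measured.

```matlab
% Sketch only: assumes the Statistics Toolbox dataset class is available
% and that the loop only reads the column.
dat  = (1:1e6)';
dat2 = repmat({'abc'}, 1e6, 1);
ds   = dataset(dat, dat2);
N    = 1e2;

% Pay the dataset.subsref cost once, outside the loop.
col = ds.dat2;          % hypothetical helper variable, not in the original post

tic;
for j = 1:N
    tmp = col;          % plain variable reference, no dataset.subsref call
end
toc;
```

If the loop also needs to modify the column, the changes would have to be assigned back to the dataset after the loop (for example `ds.dat2 = col;`), which this sketch does not cover.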
