How to efficiently compute the string lengths of a cell array of strings

Question

I have a cell array in Matlab:

strings = {'one', 'two', 'three'};

How can I efficiently calculate the length of all three strings? Right now I use a for loop:

lengths = zeros(3,1);
for i = 1:3
    lengths(i) = length(strings{i});
end

This is, however, unusably slow when you have a large number of strings (I have 480,863 of them). Any suggestions?

Recommended answer

You can also use:

cellfun(@length, strings)

It will not be faster, but makes the code clearer.
Regarding the slowness, you should first run the profiler to check where the bottleneck is. Only then should you optimize.
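
As a rough illustration of the profiling advice, the sketch below wraps the question's loop in MATLAB's profile command and prints the most expensive entries. The 3,000-element test array and the variable names stats and order are assumptions made for the example, not part of the original post.

```matlab
% Hypothetical test data: the three strings from the question, repeated.
strings = repmat({'one', 'two', 'three'}, 1, 1000);

profile on
lengths = zeros(numel(strings), 1);
for i = 1:numel(strings)
    lengths(i) = length(strings{i});
end
profile off

% Fetch the collected statistics as a struct instead of opening the viewer.
stats = profile('info');

% List up to five entries, most expensive first (the table may be short or
% empty when only built-in functions were called).
[~, order] = sort([stats.FunctionTable.TotalTime], 'descend');
for k = 1:min(5, numel(order))
    entry = stats.FunctionTable(order(k));
    fprintf('%-40s %.6f s\n', entry.FunctionName, entry.TotalTime);
end
```

Running profile viewer instead of reading the struct opens the same data in an interactive report.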

Edit: I just recalled that 'length' used to be a built-in function in cellfun in older Matlab versions. So it might actually be faster! Try

 cellfun('length',strings)
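
A quick check, using the cell array from the question, that the two call forms return the same lengths. The variable names lenHandle and lenLegacy are chosen only for this sketch.

```matlab
strings = {'one', 'two', 'three'};

lenHandle = cellfun(@length, strings);   % function-handle form
lenLegacy = cellfun('length', strings);  % function name passed as text

% Both forms return a 1-by-3 numeric array with the lengths [3 3 5].
assert(isequal(lenHandle, lenLegacy));
disp(lenLegacy)
```

Because both calls return a plain numeric array, either one can replace the preallocated loop without changing the code that follows.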

Edit (2): I have to admit that my first answer was a wild guess. Following @Rodin's comment, I decided to check the speedup.

Here is the benchmark code:

First, the code that generates a lot of strings and saves to disk:

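function GenerateCellStrings()
    % Build 10,000 random lowercase strings and save them to strs.mat.
    strs = cell(1,10000);
    for i=1:10000
        strs{i} = GenerateRandomString();
    end
    save strs;  % writes the function workspace (including strs) to strs.mat
end

function st = GenerateRandomString()
    % Random string of 1 to 1000 characters drawn from 'a'..'z' (codes 97..122).
    MAX_STR_LENGTH = 1000;
    n = randi(MAX_STR_LENGTH);
    st = char(randi([97 122], 1, n));
end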

Then, the benchmark itself:

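function CheckRunTime()
    load strs;  % loads the variable strs from strs.mat

    % Explicit loop. Note: this timer also covers the disp call below.
    tic;
    disp('Loop:');
    for i=1:numel(strs)
        n = length(strs{i});
    end
    toc;

    % cellfun with the function name passed as a string.
    disp('cellfun (String):');
    tic;
    cellfun('length',strs);
    toc;

    % cellfun with a function handle.
    disp('cellfun (function handle):');
    tic;
    cellfun(@length,strs);
    toc;
end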

The results are:

Loop:
Elapsed time is 0.010663 seconds.
cellfun (String):
Elapsed time is 0.000313 seconds.
cellfun (function handle):
Elapsed time is 0.006280 seconds.

Wow! The 'length' syntax is about 30 times faster than the loop (0.000313 s versus 0.010663 s) and about 20 times faster than the function-handle form. I can only guess why it is so fast. Maybe cellfun recognizes 'length' specifically, or it might be a JIT optimization.

Edit(3) - I found out the reason for the speedup. It is indeed recognition of length specifically. Thanks to @reve_etrange for the info.
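
For anyone who wants to re-run the comparison, here is a minimal sketch that uses timeit instead of tic/toc, since timeit calls the function several times and reports a typical time. It assumes a MATLAB release that ships timeit, and it regenerates the random strings in memory rather than loading strs.mat. The names tLegacy and tHandle are made up for this example, and the timings will differ by machine and release.

```matlab
% Regenerate 10,000 random lowercase strings of 1 to 1000 characters.
strs = arrayfun(@(k) char(randi([97 122], 1, randi(1000))), ...
                1:10000, 'UniformOutput', false);

% Time the two cellfun call forms.
tLegacy = timeit(@() cellfun('length', strs));  % name passed as text
tHandle = timeit(@() cellfun(@length, strs));   % function handle

fprintf('cellfun(''length'', strs): %.6f s\n', tLegacy);
fprintf('cellfun(@length, strs):  %.6f s\n', tHandle);
fprintf('ratio (handle / text):   %.1f\n', tHandle / tLegacy);
```

The text form is special-cased only for a small set of function names that cellfun documents for backward compatibility, so the same trick does not apply to arbitrary functions.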
