重复向量的元素 [英] Repeat elements of vector

查看:152
本文介绍了重复向量的元素的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含元素i的值向量A,例如:

A = [0.1 0.2 0.3 0.4 0.5]; 然后说r = [5 2 3 2 1];

现在,我想创建一个新的向量Anew,其中包含A中值ir(i)重复项,以使Anew中的前r(1)=5个项目具有值A(1)和长度新向量的sum(r).因此:

Anew = [0.1 0.1 0.1 0.1 0.1 0.2 0.2 0.3 0.3 0.3 0.4 0.4 0.5]

我确信这可以通过精心设计的for循环组合完成,例如repmat,但是有人知道如何以更流畅的方式做到这一点吗?

据我所知,尽管R具有 arrayfun 单线执行此操作...从技术上讲,将它处理成单个向量需要两个后处理.因此,您可以尝试以下方法:

Anew = arrayfun(@(x) repmat(A(x), r(x), 1), 1:numel(A), 'uni', 0);
Anew = vertcat(Anew{:});

这基本上用更少的代码完成了for循环和复制矢量的级联.我们遍历Ar中的每对值,并吐出复制的向量.它们每个都将位于一个单元格数组中,这就是为什么 vertcat 都放入一个向量中.

我们得到:

Anew =

    0.1000
    0.1000
    0.1000
    0.1000
    0.1000
    0.2000
    0.2000
    0.3000
    0.3000
    0.3000
    0.4000
    0.4000
    0.5000

请注意,其他人已经尝试了与您在此帖子中所做的类似的事情: 解决方案

As far as I'm aware, there is no equivalent function to do that in MATLAB, though R has rep that can do that for you.... so jealous.

In any case, the only way I can suggest is to run a for loop with repmat as you suggested. However, you can perhaps do arrayfun instead if you want to do this as a one-liner... well it's technically two to do the post-processing required to get this into a single vector. As such, you can try this:

Anew = arrayfun(@(x) repmat(A(x), r(x), 1), 1:numel(A), 'uni', 0);
Anew = vertcat(Anew{:});

This essentially does the for loop and concatenation of the replicated vectors with less code. We go through each pair of values in A and r and spit out replicated vectors. Each of them will be in a cell array, which is why vertcat is required to put it all into one vector.

We get:

Anew =

    0.1000
    0.1000
    0.1000
    0.1000
    0.1000
    0.2000
    0.2000
    0.3000
    0.3000
    0.3000
    0.4000
    0.4000
    0.5000

Take note that other people have tried something similar to what you're doing in this post: A similar function to R's rep in Matlab. This is essentially mimicking R's way of doing rep, which is what you want to do!


Alternative - Using for loops

Because of @Divakar's benchmarking, I'm curious to see how pre-allocating the array, then using an actual for loop to iterate through A and r and populate it by indexing would benchmark. As such, the equivalent code to the above using for loops and indexing would be:

Anew = zeros(sum(r), 1);
counter = 1;
for idx = 1 : numel(r)
    Anew(counter : counter + r(idx) - 1) = A(idx);
    counter = counter + r(idx);
end

We would need a variable that keeps track of where we need to insert elements in the array, which is stored in counter. We offset this by the total number of elements to replicate per number, which is stored in each value of r.

As such, this method completely avoids using repmat and just uses indexing to generate our replicated vectors instead.


Benchmarking (à la Divakar)

Building on top of Divakar's benchmarking code, I actually tried running all of the tests on my machine, in addition to the for loop approach. I simply used his benchmarking code with the same test cases.

These are the timing results I get per algorithm:

Case #1 - N = 4000, max_repeat = 4000

-------------------  With arrayfun
Elapsed time is 1.202805 seconds.
-------------------  With cumsum
Elapsed time is 1.691591 seconds.
-------------------  With bsxfun
Elapsed time is 0.835201 seconds.
-------------------  With for loop
Elapsed time is 0.136628 seconds.

Case #2 - N = 10000, max_repeat = 1000

-------------------  With arrayfun
Elapsed time is 2.117631 seconds.
-------------------  With cumsum
Elapsed time is 1.080247 seconds.
-------------------  With bsxfun
Elapsed time is 0.540892 seconds.
-------------------  With for loop
Elapsed time is 0.127728 seconds.

In these cases, cumsum actually beats out arrayfun... which is what I originally expected. bsxfun beats everyone else out, except for the for loop. My guess is with the differing times in arrayfun between myself and Divakar, we are running our code on different architectures. I'm currently running my tests using MATLAB R2013a on a Mac OS X 10.9.5 MacBook Pro machine.

As we can see, the for loop is much quicker. I know for a fact that when it comes to indexing operations in a for loop, the JIT kicks in and gives you better performance.

这篇关于重复向量的元素的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆