Replace all zeros in vector by previous non-zero value

Problem description

Matlab/Octave algorithm example:

 input vector: [ 1 0 2 0 7 7 7 0 5 0 0 0 9 ]
output vector: [ 1 1 2 2 7 7 7 7 5 5 5 5 9 ]

The algorithm is very simple: it walks through the vector and replaces every zero with the last non-zero value. This is trivial with a slow for (i=1:length) loop that can refer to the previous element (i-1), but it seems impossible to express in fast vectorized form. I tried merge() and shift(), but that only works for the first zero in a run, not for an arbitrary number of consecutive zeros.
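
To illustrate why a single shift is not enough, here is a minimal sketch. The variable names are illustrative, and plain indexing stands in for shift(). One lagged copy repairs only the first zero of each run, because the lagged value of any later zero is itself zero.

```matlab
in   = [1 0 2 0 7 7 7 0 5 0 0 0 9];
prev = [in(1), in(1:end-1)];      % lagged copy: prev(i) == in(i-1) for i >= 2
out  = in;
mask = (in == 0);                 % positions that need filling
out(mask) = prev(mask);           % helps only where the predecessor is non-zero
disp(out)
```

For this input the run of three zeros after the 5 is only partly repaired (the result still contains two zeros), which is the limitation described above.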

Can this be done in vectorized form in Octave/Matlab, or must C be used to get sufficient performance on large amounts of data?

I have another, similar slow for-loop algorithm to speed up, and it seems generally impossible to refer to previous values in vectorized form, as an SQL lag() or group by, or a loop with (i-1), would easily do. But Octave/Matlab loops are terribly slow.

Has anyone found a solution to this general problem or is this futile for fundamental Octave/Matlab design reasons?

Performance benchmark:

in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
out = in;
tic
for i=2:length(out) 
   if (out(i)==0) 
      out(i)=out(i-1);
   end
end
toc
[in(1:20); out(1:20)] % test to show side by side if ok

Elapsed time is 15.047 seconds.

in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
V = in; % working copy used by the vectorized code below
tic;
d = double(diff([0,V])>0);
d(find(d(2:end))+1) = find(diff([0,~V])==-1) - find(diff([0,~V])==1);
out = V(cumsum(~~V+d)-1);
toc;
[in(1:20); out(1:20)] % shows it works ok

Elapsed time is 0.188167 seconds.

15.047 / 0.188167 = 79.97 times improvement

in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] ,1 ,100000);
a = in;
tic;
pada = [a,888];
b = pada(pada >0);
bb = b(:,1:end-1);
c = find (pada==0);
d = find(pada>0);
len = d(2:end) - (d(1:end-1));
t = accumarray(cumsum([1,len])',1);
out = bb(cumsum(t(1:end-1)));
toc;

Elapsed time is 0.130558 seconds.

15.047 / 0.130558 = 115.25 times improvement

in = repmat([ 1 0 2 0 7 7 7 0 5 0 0 0 9 ] , 1, 100000);
tic;
u = nonzeros(in);
out = u(cumsum(in~=0)).';
toc;

Elapsed time is 0.0597501 seconds.

15.047 / 0.0597501 = 251.83 times improvement
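
The side-by-side displays above show only the first 20 elements. A fuller check, sketched here with a smaller repeat count (an arbitrary choice to keep it quick), compares the whole vectorized result against the loop result:

```matlab
in = repmat([1 0 2 0 7 7 7 0 5 0 0 0 9], 1, 1000);

% Reference result from the plain loop
ref = in;
for i = 2:length(ref)
   if ref(i) == 0
      ref(i) = ref(i-1);
   end
end

% Vectorized result (cumsum/nonzeros version)
u   = nonzeros(in);
out = u(cumsum(in ~= 0)).';

same = isequal(ref, out);   % true only when every element matches
disp(same)
```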

A second set of timings, which also includes fillmissing:

Slow loop:    0.010862 seconds.
Dan:          0.072561 seconds.
GameOfThrows: 0.066282 seconds.
Luis Mendo:   0.032257 seconds.
fillmissing:  0.053366 seconds.

So we draw yet again the same conclusion: loops in MATLAB are no longer slow!
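
The timing list includes fillmissing, but the code for it is not shown here. One plausible way to apply it, offered as a sketch rather than as the benchmarked code, is to mark zeros as NaN and fill them with the 'previous' method. This assumes MATLAB R2016b or later (Octave may need a recent release or the statistics package) and data that contains no genuine NaN values.

```matlab
in  = [1 0 2 0 7 7 7 0 5 0 0 0 9];
tmp = in;
tmp(tmp == 0) = NaN;                  % treat zeros as missing values
out = fillmissing(tmp, 'previous');   % copy the last non-missing value forward
disp(out)
```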

See also: Trivial/impossible algorithm challenge in Octave/Matlab Part II: iterations memory

Recommended answer

The following simple approach does what you want, and is probably very fast:

in = [1 0 2 0 7 7 7 0 5 0 0 0 9];
t = cumsum(in~=0);
u = nonzeros(in);
out = u(t).';
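
Here t counts the non-zero entries seen so far, so it holds still across a run of zeros, and indexing u with it repeats the last non-zero value. If the vector starts with a zero, however, t begins with 0 and the indexing fails. The sketch below shows one possible guard. It assumes leading zeros should simply stay zero, which the question does not specify.

```matlab
in = [0 0 1 0 2 0 7 7 7 0 5 0 0 0 9];   % example input with two leading zeros added
t  = cumsum(in ~= 0);                   % running count of non-zeros; 0 for leading zeros
u  = [0; nonzeros(in(:))];              % prepend a placeholder so index t+1 is always >= 1
out = u(t + 1).';                       % leading zeros map to the placeholder 0
disp(out)
```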
