对于X中的每个元素,找到最大索引而不在Y中进行遍历 [英] For each element in X find index of largest without going over in Y

查看:101
本文介绍了对于X中的每个元素,找到最大索引而不在Y中进行遍历的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在寻找一种改善以下算法性能的方法.给定两个数组X和Y.

I'm looking for a way to improve the performance of the following algorithm. Given two arrays X and Y.

对于X的每个元素,找到Y中不超过X中元素值的最大值的索引.可以安全地假设X和Y单调递增(排序)并且Y(1)为小于X中的每个值. 而且X通常比Y大得多.

For each element of X find the index of the largest value in Y that does not exceed the value of the element in X. It is safe to assume X and Y are monotonically increasing (sorted) and that Y(1) is less than every value in X. Also X is generally much larger than Y.

作为示例,给出以下内容.

As an example given the following.

X = [0.2, 1.5, 2.2, 2.5, 3.5, 4.5, 5.5, 5.8, 6.5];
Y = [0.0, 1.0, 3.0, 4.0, 6.0];

我希望输出是

idx = [1, 2, 2, 2, 3, 4, 4, 4, 5]

我想出的最快方法是下面的函数,该函数无法利用列表已排序并使用for循环遍历数组之一这一事实.这提供了一个有效的解决方案,但在我使用此功能进行的实验中,分析运行总共需要30分钟,而在这里花费了将近27分钟.

The fastest way I've come up with is the function below which fails to take advantage of the fact that the lists are sorted and uses a for loop to step through one of the arrays. This gives a valid solution but on the experiments I'm using this function for, nearly 27 minutes are spent here out of a total 30 minutes the analysis takes to run.

function idx = matchintervals(X,Y)
  idx = zeros(size(X));
  for i = 1:length(Y)-1
    idx(X >= Y(i) & X < Y(i+1)) = i;
  end
  idx(X >= Y(end)) = length(Y);
end

非常感谢您的帮助.

推荐答案

如果您正在寻找最快的解决方案,那么它最终可能会成为一个简单的while循环,就像这样(它利用了对数组进行排序的事实) ):

If you're looking for the fastest solution, it might end up being a simple while loop like so (which takes advantage of the fact that the arrays are sorted):

X = [0.2, 1.5, 2.2, 2.5, 3.5, 4.5, 5.5, 5.8, 6.5];
Y = [0.0, 1.0, 3.0, 4.0, 6.0];

xIndex = 1;
nX = numel(X);
yIndex = 1;
nY = numel(Y);
index = zeros(size(X))+nY;  % Prefill index with the largest index in Y

while (yIndex < nY) && (xIndex <= nX)
  if X(xIndex) < Y(yIndex+1)
    index(xIndex) = yIndex;
    xIndex = xIndex+1;
  else
    yIndex = yIndex+1;
  end
end

>> index

index =

     1     2     2     2     3     4     4     4     5

此循环最多可重复numel(X)+numel(Y)-1次,如果X中的许多值大于Y中的最大值,则循环次数可能会减少.

This loop will iterate a maximum of numel(X)+numel(Y)-1 times, potentially fewer if there are many values in X that are greater than the largest value in Y.

时间:我用注释中的示例数据运行了一些时间.以下是从最快到最慢排序的结果:

TIMINGS: I ran some timings with the sample data from a comment. Here are the results sorted from fastest to slowest:

X = 1:3:(4e5);
Y = 0:20:(4e5-1);

% My solution from above:
tElapsed =
   0.003005977477718 seconds

% knedlsepp's solution:
tElapsed =
   0.006939387719075 seconds

% Divakar's solution:
tElapsed =
   0.011801273498343 seconds

% H.Muster's solution:
tElapsed =
   4.081793325423575 seconds

这篇关于对于X中的每个元素,找到最大索引而不在Y中进行遍历的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆