Efficient algorithm to compute the median of pairwise absolute sums of a sorted array

Published: 2015/11/30 15:10:28. Tags: c++, c, algorithm, language-agnostic


Question

I'm trying to come up with a fast algorithm to compute the quantity b[i] = med |y_i + y_j|, taken over all j with 1 <= j <= n and j != i, when y_1, ..., y_n are already sorted (so b[] is a vector of the same length as y[]). I will assume that all elements of y[] are unique and that n is even.

The code below computes the b[i]'s the naive, O(n**2), way (I wrote it in R for convenience, but I'm language agnostic):

n<-30
a_fast<-b_slow<-rep(NA,n)
y<-sort(rnorm(n,100,1))   # sorted draws from N(100,1), so positive in practice
z<-y
for(i in 1:n){
    # median of |y[i] + y[j]| over all j != i
    b_slow[i]<-median(abs(y[-i]+y[i]))
}

I have a tentative proposal (below) for doing it in O(n), but it only works if y[] contains only positive numbers.

My question is: how should I change the fast algorithm to also work when y[] contains both positive and negative numbers? Is this even possible?

EDIT:

And here is the code for the (tentative) O(n) approach (again in R for convenience, but I'm language agnostic):

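# With sorted, positive y and even n, dropping y[i] leaves n-1 values whose
# median is y[n/2+1] when i <= n/2 and y[n/2] otherwise.
tryA<-floor(1+(n-1)/2+1)   # n/2+1
tryB<-floor(1+(n-1)/2)     # n/2
medA<-y[tryA]
medB<-y[tryB]
for(i in 1:(tryA-1)){
        a_fast[i]<-medA+y[i]
}
for(i in tryA:n){
        a_fast[i]<-medB+y[i]
}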

Simple example:

Here is a simple, illustrative example. If we have a vector of length 4

-3, -1, 2, 4

Then, for example for i=1, the 3 absolute pairwise sums are

  4 1 1

and their median is 1.

Then, for example for i=2, the 3 absolute pairwise sums are

  4 1 3

and their median is 3.
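The two worked cases above can be checked with a short Python port of the naive definition. This is only a sketch: the function name `naive_medians` is made up for illustration, and it assumes, as the question does, that `y` is sorted and has even length.

```python
from statistics import median

def naive_medians(y):
    """O(n**2) reference: b[i] = median of |y[i] + y[j]| over all j != i."""
    n = len(y)
    result = []
    for i in range(n):
        # the n - 1 absolute pairwise sums that involve y[i]
        sums = [abs(y[i] + y[j]) for j in range(n) if j != i]
        result.append(median(sums))
    return result

print(naive_medians([-3, -1, 2, 4]))
```

The first two entries correspond to the i=1 and i=2 cases worked out by hand (Python indexes from 0, R from 1).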

Here is a longer example with both positive and negative y[]:

 -1.27 -0.69 -0.56 -0.45 -0.23  0.07  0.13  0.46  1.56  1.72

and here is my new b_slow[] (this is the ground truth, computed the naive way):

1.20 0.92 1.00 1.01 0.79 0.53 0.56 0.53 1.33 1.49

but now my new a_fast[] no longer matches:

 -1.20 -0.62 -0.49 -0.38 -0.16 -0.16 -0.10  0.23  1.33  1.49
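The mismatch is easy to reproduce. Below is a Python sketch of the positive-only shortcut (the name `shortcut_medians` is hypothetical; indices are 0-based, so `y[n // 2]` plays the role of `medA` and `y[n // 2 - 1]` the role of `medB`). It never takes an absolute value, so with mixed signs it can return negative numbers, which cannot be medians of absolute sums.

```python
def shortcut_medians(y):
    """Positive-only shortcut: median of the remaining values plus y[i]."""
    n = len(y)                    # assumed even, y assumed sorted
    upper = y[n // 2]             # median of y without y[i] when i is in the lower half
    lower = y[n // 2 - 1]         # median of y without y[i] when i is in the upper half
    return [(upper if i < n // 2 else lower) + y[i] for i in range(n)]

y = [-1.27, -0.69, -0.56, -0.45, -0.23, 0.07, 0.13, 0.46, 1.56, 1.72]
print([round(v, 2) for v in shortcut_medians(y)])
```

On all-positive input such as `[1, 2, 3, 4]` the shortcut agrees with the naive definition, because then |y[i] + y[j]| = y[i] + y[j] and the ordering of the sums follows the ordering of y.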

EDIT:

Here is my implementation of Francis's solution (up to the point where we have two sorted arrays, the median of which is easy to compute). I did it in R to stay in the spirit of the question.

Nonetheless, I seem to be missing a correction factor for the index (the ww in the code below), so the code below is sometimes off by a little bit. This is because, in the definition above, the medians are computed over n-1 observations (j != i).

 n<-100
 y<-rnorm(n)
 y<-sort(y)

 b<-rep(NA,n)
 #Naive --O(n**2)-- approach:
 for(i in 1:n){
     b[i]<-median(abs(y[-i]+y[i]))
 }

 k<-rep(NA,n)
 i<-1
 k[i]<-min(na.omit(c(which(y+y[i]>0)[1],n))) #binary search: O(log(n)) -- 
 for(i in 2:n){                  #O(n)
     k_prov<-k[i-1]
     # test k_prov>0 first so that y[0] (an empty vector in R) is never evaluated
     while(k_prov>0 && y[k_prov]+y[i]>0)     k_prov<-k_prov-1
     k[i]<-max(k_prov+1,1)
     #for(i in 1:n){ should give the same result.
     #   k[i]<-which(y+y[i]>0)[1]
     #}
 }

 i<-sample(1:n,1)
 x1<--y[1:(k[i]-1)]-y[i]
 x2<-y[i]+y[n:k[i]]
 x3<-c(x1,x2)
 plot(x3)
 ww<-ifelse(i<k[i] & i>n/2,n/2+1,n/2)
 sort(x3)[ww]  #this can be computed efficiently: O(log(n))
 b[i]          #this is the O(n**2) result.

Solution

Here is an O(N x ln(N) x ln(N)) solution:

For all i:

1) Find the index k such that j<k <=> y[j]+y[i]<0 (bisection, O(ln(N))).

k separates two sorted lists: the items above -y[i] and the items below -y[i]. For the second list the sign must be flipped to get abs(y[i]+y[j]). We are now looking for the median of the union of these two lists.

From here, it is just the problem of finding the median of two sorted lists (http://stackoverflow.com/questions/4607945/how-to-find-the-kth-smallest-element-in-the-union-of-two-sorted-arrays/11698659#11698659), repeated n times.

2) Take the maximum M of these lists (M=abs(y[1]+y[i]) or M=abs(y[size]+y[i]), whichever is larger) and their minimum m (attained around k), and start a bisection on the value (O(ln(N))). At every stage, pick the middle (M+m)/2.

3) One stage of this outer bisection: how many items y[j]+y[i] are above (M+m)/2 in the first list? Once again a bisection, O(ln(N)). How many items -y[j]-y[i] are above (M+m)/2 in the second list? Another bisection. Sum these two counts. If the sum is above (size-1)/2, set m=(M+m)/2. Otherwise set M=(M+m)/2.

4) If m=M, stop: b[i]=m.
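Steps 1 to 4 can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the answerer's own code: the names `count_at_most` and `medians_by_bisection`, the tolerance `tol` and the iteration cap `max_iter` are made up here, the lower bound starts at 0 instead of the minimum around k, and the count is of items at or below the midpoint (the complement of the "above" count in step 3). Because the bisection is on real values, the result is only accurate to about `tol`.

```python
from bisect import bisect_left, bisect_right

def count_at_most(y, i, t):
    """Number of j != i with |y[i] + y[j]| <= t, using two binary searches on sorted y."""
    first = bisect_left(y, -t - y[i])    # first j with y[j] >= -t - y[i]
    last = bisect_right(y, t - y[i])     # one past the last j with y[j] <= t - y[i]
    count = max(last - first, 0)
    if first <= i < last:                # do not count the pair (i, i)
        count -= 1
    return count

def medians_by_bisection(y, tol=1e-9, max_iter=200):
    n = len(y)                           # assumed even, y assumed sorted
    rank = n // 2                        # median of n - 1 values is the (n/2)-th smallest
    result = []
    for i in range(n):
        low = 0.0
        high = float(max(abs(y[0] + y[i]), abs(y[-1] + y[i])))
        for _ in range(max_iter):        # outer bisection on the value
            if high - low <= tol:
                break
            mid = (low + high) / 2
            if count_at_most(y, i, mid) >= rank:
                high = mid               # the median is at or below mid
            else:
                low = mid                # the median is above mid
        result.append(high)
    return result

print(medians_by_bisection([-3, -1, 2, 4]))
```

Each outer step costs two binary searches, which matches the nested ln(N) factors in the stated complexity if the number of outer steps is treated as logarithmic.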

I guess somebody will come up with a better solution...

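Edit: I should thank @user189035 for his link to an O(ln(n+m)) algorithm to compute the median of two sorted lists: "How to find the kth smallest element in the union of two sorted arrays?" (http://stackoverflow.com/questions/4607945/how-to-find-the-kth-smallest-element-in-the-union-of-two-sorted-arrays/8935157#8935157)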
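With a logarithmic k-th smallest routine, the value bisection is no longer needed and each b[i] comes out exactly. The sketch below uses the standard partition-based binary search, which is not necessarily the variant described in the linked thread. The names `kth_smallest` and `medians_exact` are hypothetical. The two lists are never built: they are read through index functions that skip j = i, so each b[i] costs O(ln(N)).

```python
from bisect import bisect_left

def kth_smallest(fa, na, fb, nb, k):
    """k-th smallest (1-based) of two ascending sequences given as index functions."""
    lo, hi = max(0, k - nb), min(k, na)
    while lo < hi:
        p = (lo + hi) // 2               # candidate number of items taken from A
        if fb(k - p - 1) > fa(p):        # too few items taken from A
            lo = p + 1
        else:
            hi = p
    p, q = lo, k - lo                    # p items from A, q items from B
    candidates = []
    if p > 0:
        candidates.append(fa(p - 1))
    if q > 0:
        candidates.append(fb(q - 1))
    return max(candidates)

def medians_exact(y):
    n = len(y)                           # assumed even, y assumed sorted
    rank = n // 2                        # median of n - 1 values is the (n/2)-th smallest
    out = []
    for i in range(n):
        yi = y[i]
        split = bisect_left(y, -yi)      # y[j] + yi < 0 exactly when j < split
        self_above = i >= split          # which list the excluded pair (i, i) falls in
        na = n - split - (1 if self_above else 0)
        nb = split - (0 if self_above else 1)

        def above(p):                    # ascending sums y[j] + yi for j >= split, j != i
            j = split + p
            if self_above and j >= i:
                j += 1
            return y[j] + yi

        def below(q):                    # ascending values -(y[j] + yi) for j < split, j != i
            j = split - 1 - q
            if not self_above and j <= i:
                j -= 1
            return -(y[j] + yi)

        out.append(kth_smallest(above, na, below, nb, rank))
    return out

print(medians_exact([-3, -1, 2, 4]))
```

Skipping index i inside the accessors is one way to handle the n-1 observations (j != i) without an after-the-fact index correction.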

Bye,
