Merge data.table by two nearest variables
I have two data tables with x, y coordinates and some other information, which I would like to merge based on nearest-neighbour distance, i.e. for each row i of the first table I want the row j of the second table that minimises the Euclidean distance: d_i = min_j [(x_i - x_j)^2 + (y_i - y_j)^2]^0.5. Say I have the following two sets:
library(data.table)
DT1 = data.table(x = 1:5, y = 3:7)
DT2 = data.table(x = c(2, 4, 2, 3, 6), y = c(2.5, 3.1, 2, 3, 5), Q = c('a', 'b', 'c', 'd', 'e'))
Then the desired result of the merge would be:
x y Q
1: 1 3 a
2: 2 4 d
3: 3 5 d
4: 4 6 e
5: 5 7 e
I could of course write a loop over DT1 to calculate the nearest neighbour for each row in DT1 and then merge based on this calculation, but that seems to defeat the purpose of data tables. Moreover, that will be very slow for data tables of several million rows.
I know that for a single column I could do a nearest-neighbour merge like this:
DT2[DT1,roll="nearest"]
But that (logically) doesn't work when I define two keys (x and y) for the tables to be merged. Does a similar syntax for a two-parameter nearest-neighbour merge exist? If not, is there a smarter way to do this than just looping, as I mentioned?
One possible solution:
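# Return Q from the DT2 row closest to the point (u, v)
func = function(u, v) {
  # squared Euclidean distance from (u, v) to every row of DT2
  vec = with(DT2, (u - x)^2 + (v - y)^2)
  DT2[which.min(vec), ]$Q
}
# Apply func to each row of DT1 and attach the result as column Q
transform(DT1, Q = apply(DT1, 1, function(u) func(u[1], u[2])))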
# x y Q
#1: 1 3 a
#2: 2 4 d
#3: 3 5 d
#4: 4 6 e
#5: 5 7 e
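The solution above still evaluates func once per row of DT1. As a rough sketch of an alternative (not part of the original answer), the squared distances can be computed for all pairs at once with base R's outer(), leaving only a cheap which.min per row. The helper name nearest_index is hypothetical. The approach assumes the full nrow(DT1) by nrow(DT2) distance matrix fits in memory, which will not hold for tables with millions of rows unless DT1 is processed in chunks.

```r
library(data.table)

DT1 <- data.table(x = 1:5, y = 3:7)
DT2 <- data.table(x = c(2, 4, 2, 3, 6),
                  y = c(2.5, 3.1, 2, 3, 5),
                  Q = c("a", "b", "c", "d", "e"))

# Hypothetical helper: for each query point (qx, qy), return the index of
# the nearest candidate point (px, py) by squared Euclidean distance.
nearest_index <- function(qx, qy, px, py) {
  # One row per query point, one column per candidate point
  d2 <- outer(qx, px, "-")^2 + outer(qy, py, "-")^2
  # Column index of the smallest distance in each row (first one on ties)
  apply(d2, 1, which.min)
}

idx <- nearest_index(DT1$x, DT1$y, DT2$x, DT2$y)

# Attach the matched Q values to a new table, leaving DT1 untouched
result <- data.table(DT1, Q = DT2$Q[idx])
print(result)
```

For very large inputs, a k-d tree based nearest-neighbour search (for example nn2 from the RANN package) is likely a better fit than any all-pairs computation, though that adds a dependency the original question does not use.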