If/else if: pick first matching record within set distance only after first condition is not met in R
Problem description
I would like to pick the closest previous owner within a set distance, but only after the first search condition isn't met. The locations are called reflo (reference location), and they have corresponding x and y coordinates (called locx and locy, respectively).
Conditions:
- if lifetime_census$reflo == owners$reflo.x[i], then the condition is met
- if lifetime_census$reflo != owners$reflo.x[i], then find the next closest record (within 30 meters)
- if there is no record within 30 meters, then assign NA
Previous owners (>20,000) are stored in a dataset called lifetime_census. Here is a sample of the data:
id previous_id reflo locx locy lifespan
16161 5587 -310 -3 10 1810
16848 5101 Q1 17.3 0.8 55
21815 6077 M2 13 1.8 979
23938 6130 -49 -4 9 374
29615 7307 B.1 2.5 1 1130
I then have an owners dataset (here is a sample):
squirrel_id spr_census reflo.x spring_locx spring_locy
6391 2005 M3 13 2.5
6130 2005 -310 -3 10
23586 2019 B9 2 9
To illustrate what I am trying to achieve:
squirrel_id spr_census reflo.x spring_locx spring_locy previous_owner
6391 2004 M3 13 2.5 6077
6130 2005 -310 -3 10 5587
23586 2019 B9 2 9 NA
What I have currently tried is this:
n <- length(owners$squirrel_id)
distance <- 30 #This can be easily changed to bigger or smaller values
for(i in 1:n) {
  last_owner <- subset(lifetime_census,
    lifetime_census$reflo==owners$reflo.x[i] & #using the exact location
    ((30*owners$spring_locx[i]-30*lifetime_census$locx)^2 +
     (30*owners$spring_locy[i]-30*lifetime_census$locy)^2 <= (distance)^2)) #this sets the search limit
  owners[i,"previous_owner"] <- last_owner$previous_id[i]
}
I can't figure out how to make the loop work through the conditions in order and then make the selection. Any ideas?
Recommended answer
I would suggest something like this (assuming the units for locx and the like are the same as for distance):
distance = 30
distance_xy = function (x1, y1, x2, y2) {
  # Euclidean distance between (x1, y1) and (x2, y2)
  sqrt((x2 - x1)^2 + (y2 - y1)^2)
}
# create the result column up front so every row starts as NA
owners$previous_owner = NA
for (i in 1:dim(owners)[1]) {
  if (owners$reflo.x[i] %in% lifetime_census$reflo) {
    # exact reflo match: take its previous_id
    owners$previous_owner[i] = lifetime_census[lifetime_census$reflo == owners$reflo.x[i], ]$previous_id
  } else {
    # no exact match: distances from this owner to every census record
    dt = distance_xy(owners$spring_locx[i], owners$spring_locy[i], lifetime_census$locx, lifetime_census$locy)
    if (any(dt <= distance)) {
      owners$previous_owner[i] = lifetime_census[order(dt), ]$previous_id[1L]
    } else {
      owners$previous_owner[i] = NA
    }
  }
}
This gives:
squirrel_id spr_census reflo.x spring_locx spring_locy previous_owner
1 6391 2005 M3 13 2.5 6077
2 6130 2005 -310 -3 10.0 5587
3 23586 2019 B9 2 9.0 5587
Note that this will fail if there is more than one match for reflo.
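One way to sidestep the multiple-match problem, sketched here under the assumption that taking the first matching row is acceptable (the question does not say which match should win), is match(), which returns the position of the first match or NA. The data frames are rebuilt from the samples in the question; the duplicate Q1 row and the owner 1111 are hypothetical and added only for illustration.

```r
# Sample census rebuilt from the question. The duplicate "Q1" row
# (previous_id 9999) is hypothetical, added to trigger the multiple-match case.
lifetime_census <- data.frame(
  previous_id = c(5587, 5101, 6077, 6130, 7307, 9999),
  reflo = c("-310", "Q1", "M2", "-49", "B.1", "Q1"),
  locx = c(-3, 17.3, 13, -4, 2.5, 17.3),
  locy = c(10, 0.8, 1.8, 9, 1, 0.8),
  stringsAsFactors = FALSE
)
owners <- data.frame(
  squirrel_id = c(6130, 1111),  # 1111 is a hypothetical owner located at Q1
  reflo.x = c("-310", "Q1"),
  stringsAsFactors = FALSE
)

# match() returns the index of the FIRST matching reflo, or NA if there is none,
# so each owner receives exactly one value even when reflo is duplicated.
idx <- match(owners$reflo.x, lifetime_census$reflo)
owners$previous_owner <- lifetime_census$previous_id[idx]
print(owners)
```

Rows where idx is NA have no exact match and would then fall through to the distance search.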
Adding an alternative based on the comment below.
if-else statements can get pretty confusing when you start adding conditions. This is another way of achieving the same result while avoiding the nested structure above:
for (i in 1:dim(owners)[1]) {
  # if we find the reflo
  if (owners$reflo.x[i] %in% lifetime_census$reflo) {
    owners$previous_owner[i] = lifetime_census[lifetime_census$reflo == owners$reflo.x[i], ]$previous_id
    next
  }
  # if we got here, then we didn't find the reflo, compute distances:
  dt = distance_xy(owners$spring_locx[i], owners$spring_locy[i], lifetime_census$locx, lifetime_census$locy)
  # if we find anyone within distance, get the closest one
  if (any(dt <= distance)) {
    owners$previous_owner[i] = lifetime_census[order(dt), ]$previous_id[1L]
    next
  }
  # if we got here, there was nobody within range, set NA and move on:
  owners$previous_owner[i] = NA
}
The code does exactly the same thing, but by taking advantage of the for loop and next it is possible to remove every else and the whole nested structure.
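The attempt in the question multiplies every coordinate by 30, which suggests (this is an assumption, not something the question states) that one grid unit corresponds to 30 meters. Under that assumption, a sketch of the same loop with distances converted to meters before comparing against the limit could look like this. The name meters_per_unit is introduced here for illustration, and the first exact match is taken if there are several.

```r
# Sample data rebuilt from the question.
lifetime_census <- data.frame(
  previous_id = c(5587, 5101, 6077, 6130, 7307),
  reflo = c("-310", "Q1", "M2", "-49", "B.1"),
  locx = c(-3, 17.3, 13, -4, 2.5),
  locy = c(10, 0.8, 1.8, 9, 1),
  stringsAsFactors = FALSE
)
owners <- data.frame(
  squirrel_id = c(6391, 6130, 23586),
  reflo.x = c("M3", "-310", "B9"),
  spring_locx = c(13, -3, 2),
  spring_locy = c(2.5, 10, 9),
  stringsAsFactors = FALSE
)

distance <- 30         # search limit in meters
meters_per_unit <- 30  # ASSUMPTION: grid-unit size inferred from the question's code

owners$previous_owner <- NA_real_
for (i in seq_len(nrow(owners))) {
  # first condition: exact reflo match (first one wins if there are several)
  exact <- which(lifetime_census$reflo == owners$reflo.x[i])
  if (length(exact) > 0) {
    owners$previous_owner[i] <- lifetime_census$previous_id[exact[1]]
    next
  }
  # otherwise: distance in meters from this owner to every census record
  dt <- meters_per_unit * sqrt((lifetime_census$locx - owners$spring_locx[i])^2 +
                               (lifetime_census$locy - owners$spring_locy[i])^2)
  # keep the closest record only if it lies within the limit; else the row stays NA
  if (any(dt <= distance)) {
    owners$previous_owner[i] <- lifetime_census$previous_id[which.min(dt)]
  }
}
print(owners)
```

With the sample data this yields 6077, 5587 and NA, which matches the table the question is aiming for. With coordinates and distance in the same units, the third owner instead gets 5587.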