Subsetting neighboring fields
Problem description
I am trying to build a conditional subset that includes contemporary elements within a window of neighboring areas. For example, given the matrix Dat, with columns Species (SP), Area (AR) and Time (TM):
SP AR TM
A 2 2
B 2 2
C 1 4
F 3 2
B 5 3
E 3 2
D 2 1
I 1 4
H 3 2
E 2 4
D 3 5
B 1 2
How can I retrieve all the species co-occurring with species A at the same time, but within neighboring areas (in this case 1 and 3)? The desired output is:
SP AR TM
A 2 2
B 2 2
F 3 2
H 3 2
B 1 2
This is based on the assumption that species A occurs repeatedly in the dataset, in different areas. I have an attempt, adapted with minor modifications from code that user thelatemail gave in answer to a different (related) question I posted previously. The X's mark the part of the syntax I cannot figure out, which is essentially the definition of the bracket (this is more or less where it should go).
with(dat,dat[
apply(
sapply(TM[SP=="A"],
function(x) abs(AR)XXXXX),1,any
)
,]
)
Thank you very much for your help.
This is part of a set of operations I am trying to perform on a large dataset. I split the question into two parts, which are somewhat related but far from being duplicates. The reason is that I am a beginning R user and want to learn how to interpret, write and integrate code on my own. The link to the related question: Subsetting based on co-occurrence within a time window. I can remove one of the links if needed.
Recommended answer
Assuming neighboring areas are defined as areas within +/- 1 of the given area:
Copying from the previous question, Subsetting based on co-occurrence within a time window:
with(dat,dat[
(
SP=="A" |
# Area %in% Area[SP=="A"]
Area %in% c(Area[SP=='A']-1, Area[SP=='A'], Area[SP=='A']+1)
) &
apply(
sapply(Time[SP=="A"],
function(x) abs(difftime(Time,x,units="mins"))<=30 ),1,any
)
,]
)
As per the comment (and my comment):
area <- 2
dat[dat$AR %in% c(area - 1, area, area + 1),]
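The snippet above hard-codes area 2. A minimal sketch that instead derives the neighboring areas from wherever species A occurs is shown below; it assumes the example data from the question (columns SP, AR, TM), and the names a_areas and neighbors are illustrative only.

```r
# Example data from the question (columns SP, AR, TM)
dat <- data.frame(
  SP = c("A", "B", "C", "F", "B", "E", "D", "I", "H", "E", "D", "B"),
  AR = c(2, 2, 1, 3, 5, 3, 2, 1, 3, 2, 3, 1),
  TM = c(2, 2, 4, 2, 3, 2, 1, 4, 2, 4, 5, 2),
  stringsAsFactors = FALSE
)

# Every area in which species A occurs (only area 2 in this example)
a_areas <- unique(dat$AR[dat$SP == "A"])

# Those areas plus their +/- 1 neighbors (assumed definition of "neighboring")
neighbors <- unique(c(a_areas - 1, a_areas, a_areas + 1))

# Keep rows whose area is one of A's areas or a neighbor; time is not considered here
area_subset <- dat[dat$AR %in% neighbors, ]
```

This filters on area only, so it still has to be combined with a time condition to answer the full question.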
Regarding the related question, Conditional subsetting by POSIXct interval and another field containing interval: removing the SP=='A' condition should give the correct subset:
area_boolean <- with(dat, Area %in% c(Area[SP=='A']-1, Area[SP=='A'], Area[SP=='A']+1))
time_boolean <- with(dat, apply(sapply(Time[SP=="A"],
function(x) abs(difftime(Time, x, units="mins")) <= 30 ),
1,
any))
dat[area_boolean & time_boolean,]
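The code above uses the Area and Time columns (and a 30-minute difftime window) from the earlier question. For the example data in this question, where the columns are AR and TM and time is a plain integer, a hedged sketch might look like the following. It assumes "same time" means equal TM and "neighboring" means within +/- 1 area; a_rows, keep and result are illustrative names.

```r
# Example data from the question (columns SP, AR, TM)
dat <- data.frame(
  SP = c("A", "B", "C", "F", "B", "E", "D", "I", "H", "E", "D", "B"),
  AR = c(2, 2, 1, 3, 5, 3, 2, 1, 3, 2, 3, 1),
  TM = c(2, 2, 4, 2, 3, 2, 1, 4, 2, 4, 5, 2),
  stringsAsFactors = FALSE
)

# All occurrences of species A
a_rows <- dat[dat$SP == "A", ]

# One logical column per occurrence of A: TRUE where a row lies within
# +/- 1 area of that occurrence AND has the same time as that occurrence
keep <- vapply(
  seq_len(nrow(a_rows)),
  function(i) abs(dat$AR - a_rows$AR[i]) <= 1 & dat$TM == a_rows$TM[i],
  logical(nrow(dat))
)

# Keep rows that match at least one occurrence of A
result <- dat[rowSums(keep) > 0, ]
```

Unlike area_boolean & time_boolean, which test area and time independently against all occurrences of A, this sketch pairs the area and time of each occurrence, so a row cannot match on the area of one occurrence and the time of another. On the example data it returns A, B, F, E, H and B; the row E (area 3, time 2) satisfies the stated criteria even though it is not listed in the desired output in the question.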