Subsetting neighboring fields

Question

I am trying to build a conditional subset that includes contemporaneous elements within a window of neighboring areas. For example, given the data frame dat, with columns Species (SP), Area (AR) and Time (TM):

SP AR TM
A  2  2
B  2  2
C  1  4
F  3  2
B  5  3
E  3  2
D  2  1
I  1  4
H  3  2
E  2  4
D  3  5
B  1  2

How can I retrieve all the species co-occurring with species A at the same time, but within neighboring areas (in this case 1 and 3)? The desired output is:

SP  AR  TM
A  2  2
B  2  2
F  3  2
H  3  2
B  1  2

This is based on the assumption that Species A occurs repeatedly in the dataset, in different areas. I have an attempt, adapted with minor modifications from an answer that user thelatemail gave to a related question I posted previously. The X's mark the part of the syntax I cannot figure out, which is essentially the condition that defines the area window (this is more or less where it should go).

with(dat,dat[
  apply(
    sapply(TM[SP=="A"],
    function(x) abs(AR)XXXXX),1,any
  )
,]
)

Thank you very much for your help.

This is part of a set of operations I am trying to perform on a large dataset. I split the problem into two questions, which are related but far from being duplicates. The reason is that I am a beginning R user and want to learn how to interpret, write and integrate code on my own. The related question is: Subsetting based on co-occurrence within a time window. I can remove one of the links if needed.

Recommended answer

Assuming neighboring areas are defined by areas within +/- 1 of the given area:

Copying from the previous question, Subsetting based on co-occurrence within a time window (note that this code uses that question's column names, Area and Time, rather than AR and TM):

with(dat,dat[
  (
    SP=="A" |
    # Area %in% Area[SP=="A"]
    Area %in% c(Area[SP=='A']-1, Area[SP=='A'], Area[SP=='A']+1)
  ) & 
  apply(
    sapply(Time[SP=="A"],
    function(x) abs(difftime(Time,x,units="mins"))<=30 ),1,any
  ) 
,]
)


As per the comment (and my own comment), for a single given area:

area <- 2
dat[dat$AR %in% c(area - 1, area, area + 1),]
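
Applied to the example data from the question (columns SP, AR and TM, with TM treated as a plain number rather than a POSIXct timestamp), the same idea might be sketched as follows. The data frame is retyped from the table in the question; the names a_rows, area_ok, time_ok and res are illustrative only, and "neighboring" is assumed to mean within +/- 1 of any area where A occurs.

```r
# Example data retyped from the question
dat <- data.frame(
  SP = c("A", "B", "C", "F", "B", "E", "D", "I", "H", "E", "D", "B"),
  AR = c(2, 2, 1, 3, 5, 3, 2, 1, 3, 2, 3, 1),
  TM = c(2, 2, 4, 2, 3, 2, 1, 4, 2, 4, 5, 2),
  stringsAsFactors = FALSE
)

# Rows where the focal species occurs
a_rows <- dat$SP == "A"

# Areas within +/- 1 of any area containing A (assumed definition of "neighboring")
area_ok <- dat$AR %in% c(dat$AR[a_rows] - 1, dat$AR[a_rows], dat$AR[a_rows] + 1)

# Times at which A occurs (exact match, since TM is a plain number here)
time_ok <- dat$TM %in% dat$TM[a_rows]

res <- dat[area_ok & time_ok, ]
res
```

On this data the filter keeps the five rows listed as the desired output, plus E in area 3 at time 2, which also satisfies the stated condition even though it is absent from the listed output.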


And with regard to the related question, Conditional subsetting by POSIXct interval and another field containing interval: removing the condition on SP=='A' should result in the correct subset:

area_boolean <- with(dat, Area %in% c(Area[SP=='A']-1, Area[SP=='A'], Area[SP=='A']+1))
time_boolean <- with(dat, apply(sapply(Time[SP=="A"],
                                function(x) abs(difftime(Time, x, units="mins")) <= 30 ),  
                                1, 
                                any))
dat[area_boolean & time_boolean,]
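
One caveat when A occurs more than once: the two masks above are computed independently, so a row can pass by being near the area of one A occurrence and the time of a different one. The sketch below is a possible paired variant, not the answerer's method. It assumes the question's numeric columns AR and TM; the function name co_occurring, its parameters, and the small data frame dat2 are hypothetical and exist only for illustration.

```r
# Keep rows that are close in BOTH area and time to the SAME occurrence of the focal species
co_occurring <- function(dat, focal = "A", area_window = 1, time_window = 0) {
  foc <- dat[dat$SP == focal, , drop = FALSE]
  if (nrow(foc) == 0) return(dat[0, , drop = FALSE])
  # Matrices with one row per record and one column per focal occurrence
  near_area <- abs(outer(dat$AR, foc$AR, "-")) <= area_window
  near_time <- abs(outer(dat$TM, foc$TM, "-")) <= time_window
  # A record is kept if at least one focal occurrence matches on both criteria
  keep <- rowSums(near_area & near_time) > 0
  dat[keep, , drop = FALSE]
}

# Made-up data with two occurrences of A, far apart in area and time
dat2 <- data.frame(
  SP = c("A", "B", "A", "X", "Y"),
  AR = c(2, 3, 8, 3, 9),
  TM = c(2, 2, 9, 9, 9),
  stringsAsFactors = FALSE
)

# Independent masks: X (area 3, time 9) passes, mixing the two A occurrences
a <- dat2$SP == "A"
naive <- dat2[dat2$AR %in% c(dat2$AR[a] - 1, dat2$AR[a], dat2$AR[a] + 1) &
              dat2$TM %in% dat2$TM[a], ]

# Paired check: X is dropped because no single A occurrence is near it in both area and time
paired <- co_occurring(dat2)
```

If every occurrence of A happens at the same time or in the same area, the two approaches should give the same result, so the simpler independent masks may be enough.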
