Data frame lookup value in range and return different column
Problem description
I have two data frames and wish to use the value in one (DF1$pos) to search through two columns in DF2 (DF2$start, DF2$end) and, if it falls within those numbers, return DF2$annot as DF1$name.
DF1

    ID   pos  name
    chr   12
    chr  542
    chr  674

DF2

    ID   start   end  annot
    chr      1   200  a1
    chr    201   432  a2
    chr    540  1002  a3
    chr   2000  2004  a4

So in this example I would like DF1 to become

    ID   pos  name
    chr   12  a1
    chr  542  a3
    chr  674  a3
I have tried using merge and intersect, but I do not know how to use an if statement with a logical expression in them.

The data frames should be coded as follows:

    DF1 <- data.frame(ID   = c("chr", "chr", "chr"),
                      pos  = c(12, 542, 674),
                      name = c(NA, NA, NA))

    DF2 <- data.frame(ID    = c("chr", "chr", "chr", "chr"),
                      start = c(1, 201, 540, 2000),
                      end   = c(200, 432, 1002, 2004),
                      annot = c("a1", "a2", "a3", "a4"))
Solution

Perhaps you can use foverlaps from the "data.table" package.

    library(data.table)
    DT1 <- data.table(DF1)
    DT2 <- data.table(DF2)
    setkey(DT2, ID, start, end)
    DT1[, c("start", "end") := pos]  ## I don't know if there's a way around this step...
    foverlaps(DT1, DT2)
    #     ID start  end annot pos i.start i.end
    # 1: chr     1  200    a1  12      12    12
    # 2: chr   540 1002    a3 542     542   542
    # 3: chr   540 1002    a3 674     674   674

    foverlaps(DT1, DT2)[, c("ID", "pos", "annot"), with = FALSE]
    #     ID pos annot
    # 1: chr  12    a1
    # 2: chr 542    a3
    # 3: chr 674    a3
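On the "way around this step" comment: one possibility, assuming a data.table version that supports non-equi joins (1.9.8 or later, as far as I know), is an update join that compares pos against start and end directly, so the dummy start/end columns on DT1 are not needed. This is only a sketch and not part of the original answer. It rebuilds the tables from the question's values and leaves the name column out of DT1 so that the join can create it (both of those are my assumptions).

```r
library(data.table)

# Tables rebuilt from the question's values (assumption: DT1 has no 'name' column yet)
DT1 <- data.table(ID = c("chr", "chr", "chr"),
                  pos = c(12, 542, 674))
DT2 <- data.table(ID = c("chr", "chr", "chr", "chr"),
                  start = c(1, 201, 540, 2000),
                  end = c(200, 432, 1002, 2004),
                  annot = c("a1", "a2", "a3", "a4"))

# Update join: for every DT2 row, find the DT1 rows with the same ID and
# start <= pos <= end, and write that row's annot into a new 'name' column.
# Positions that fall in no range are left as NA.
DT1[DT2, on = .(ID, pos >= start, pos <= end), name := i.annot]

print(DT1)
```

If ranges in DT2 can overlap, a position may match more than one row, and which annotation ends up in name would need to be decided explicitly.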
As mentioned by @Arun in the comments, you can also use which = TRUE in foverlaps to extract the relevant values:

    foverlaps(DT1, DT2, which = TRUE)
    #    xid yid
    # 1:   1   1
    # 2:   2   3
    # 3:   3   3

    DT2$annot[foverlaps(DT1, DT2, which = TRUE)$yid]
    # [1] "a1" "a3" "a3"
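For small data frames the same lookup can also be sketched in base R, which may help show the logical expression the question was asking about. This is an illustration under my own assumptions, not the answerer's method: lookup_annot is a hypothetical helper name, the data frames are rebuilt with stringsAsFactors = FALSE, and when several ranges match, only the first one is returned.

```r
# Data rebuilt from the question's tables, with strings kept as character
DF1 <- data.frame(ID = c("chr", "chr", "chr"),
                  pos = c(12, 542, 674),
                  stringsAsFactors = FALSE)
DF2 <- data.frame(ID = c("chr", "chr", "chr", "chr"),
                  start = c(1, 201, 540, 2000),
                  end = c(200, 432, 1002, 2004),
                  annot = c("a1", "a2", "a3", "a4"),
                  stringsAsFactors = FALSE)

# Hypothetical helper: annotation of the first range that contains 'pos'
# for the given 'id', or NA if no range matches
lookup_annot <- function(id, pos, ranges) {
  hit <- which(ranges$ID == id & ranges$start <= pos & pos <= ranges$end)
  if (length(hit) == 0) NA_character_ else ranges$annot[hit[1]]
}

# Apply the helper to each (ID, pos) pair of DF1
DF1$name <- mapply(lookup_annot, DF1$ID, DF1$pos,
                   MoreArgs = list(ranges = DF2), USE.NAMES = FALSE)

print(DF1)
```

This scans every row of DF2 for each row of DF1, so it is likely to be much slower than foverlaps on large inputs.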