Merge data frame based on vector key
Question
I'm an absolute beginner and am hoping someone can help me with a merge problem that I've been stuck on for most of this evening. So far I have been unable to adapt solutions to similar problems to this particular example.
I've made a dummy data frame and vector to help illustrate my problem:
dumdata <- data.frame(id=c(1:5), pcode=c(1234,9876,4477,2734,3999), vlo=c(100,450,1000,1325,1500), vhi=c(300,950,1100,1450,1700))
id pcode vlo vhi
1 1234 100 300
2 9876 450 950
3 4477 1000 1100
4 2734 1325 1450
5 3999 1500 1700
vkey <- c(105,290,513,1399,1572,1683)
I would like to output a new data frame that contains the rows of dumdata for which a value of vkey falls between the variables vlo and vhi, with the matching vkey value added as a column. In practice, each value of vkey will always fall within a vlo-vhi range, and the ranges are always discrete.
The desired output looks like this:
id pcode vlo vhi vkey
1 1234 100 300 105
1 1234 100 300 290
2 9876 450 950 513
4 2734 1325 1450 1399
5 3999 1500 1700 1572
5 3999 1500 1700 1683
Recommended answer
Rather than using for loops, you can construct the whole index vector in one go with sapply.
ind <- sapply(vkey, function(x) which(dumdata$vlo < x & x < dumdata$vhi))
data.frame(dumdata[ind,], vkey)
id pcode vlo vhi vkey
1 1 1234 100 300 105
1.1 1 1234 100 300 290
2 2 9876 450 950 513
4 4 2734 1325 1450 1399
5 5 3999 1500 1700 1572
5.1 5 3999 1500 1700 1683
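The sapply call above only returns a plain index vector when every key matches exactly one row. As a hedged variation (not part of the original answer), vapply with an integer(1) template makes that assumption explicit and stops with an error otherwise. The extra key 400 below is a made-up value that falls in no range, added only to show the failure case.

```r
dumdata <- data.frame(id = 1:5,
                      pcode = c(1234, 9876, 4477, 2734, 3999),
                      vlo = c(100, 450, 1000, 1325, 1500),
                      vhi = c(300, 950, 1100, 1450, 1700))
vkey <- c(105, 290, 513, 1399, 1572, 1683)

# Row number in dumdata whose (vlo, vhi) range contains x
match_row <- function(x) which(dumdata$vlo < x & x < dumdata$vhi)

# vapply insists on exactly one integer per key, so a key with
# zero or several matches raises an error instead of silently
# changing the shape of the result
ind <- vapply(vkey, match_row, integer(1))
res <- data.frame(dumdata[ind, ], vkey)

# Hypothetical key 400 lies in no range, so vapply fails here
bad <- tryCatch(vapply(c(vkey, 400), match_row, integer(1)),
                error = function(e) "no unique match")
```

If the error case triggers on real data, the lapply approach is the one to reach for.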
If any value in vkey matches multiple rows in dumdata it gets uglier though, as you'll need to use lapply instead of sapply and then do
data.frame(dumdata[unlist(ind),], vkey = rep(vkey, sapply(ind, length)))
to return all matches, but I take it from the example that this is not going to happen.
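For illustration, here is a minimal sketch of the multiple-match case. The data frame dd and the vector keys are hypothetical, chosen so that two ranges overlap; they are not from the question. Each key has to be repeated once per matching row, which is why the repeat counts come from the lengths of the elements of ind.

```r
# Hypothetical data: the first two ranges overlap between 200 and 300
dd <- data.frame(id = 1:3,
                 vlo = c(100, 200, 500),
                 vhi = c(300, 400, 600))
keys <- c(150, 250, 550)

# lapply always returns a list: one vector of row numbers per key
ind <- lapply(keys, function(x) which(dd$vlo < x & x < dd$vhi))

# lengths(ind) is the number of matching rows for each key,
# so each key is repeated once per row it matched
res <- data.frame(dd[unlist(ind), ], vkey = rep(keys, lengths(ind)))
rownames(res) <- NULL
print(res)
```

Here the key 250 falls in both of the first two ranges, so it appears in two rows of the result.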
For completeness I'll add that you can use mapply too, but this is mainly intended for the case when you need to make comparisons with more than one variable (like if you had vkey1 and vkey2 that need to fulfill a condition together).
ind <- mapply(function(x, y) which(dumdata$vlo < x & y < dumdata$vhi),
vkey1, vkey2)
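As a rough sketch of how that might be used end to end: the values of vkey1 and vkey2 below are invented for illustration, with each pair read as an interval that must lie entirely inside one vlo-vhi range. It assumes every pair matches exactly one row, so that mapply simplifies to a plain index vector.

```r
dumdata <- data.frame(id = 1:5,
                      pcode = c(1234, 9876, 4477, 2734, 3999),
                      vlo = c(100, 450, 1000, 1325, 1500),
                      vhi = c(300, 950, 1100, 1450, 1700))

# Hypothetical paired keys: lower and upper end of each interval
vkey1 <- c(105, 513, 1572)
vkey2 <- c(290, 600, 1683)

# mapply walks vkey1 and vkey2 in parallel, passing one element
# of each to the function as x and y
ind <- mapply(function(x, y) which(dumdata$vlo < x & y < dumdata$vhi),
              vkey1, vkey2)

res <- data.frame(dumdata[ind, ], vkey1, vkey2)
print(res)
```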