Merge data frame based on vector key

Question

I'm an absolute beginner and am hoping someone will be able to help me with a merge problem that I've been stuck on for most of this evening and have thus far been unable to successfully adapt solutions to similar problems to this particular example.

I've made a dummy data frame and vector to help illustrate my problem:

dumdata <- data.frame(id=c(1:5), pcode=c(1234,9876,4477,2734,3999), vlo=c(100,450,1000,1325,1500), vhi=c(300,950,1100,1450,1700))

id pcode  vlo  vhi
 1  1234  100  300
 2  9876  450  950
 3  4477 1000 1100
 4  2734 1325 1450
 5  3999 1500 1700


vkey <- c(105,290,513,1399,1572,1683)

I would like to output a new dataframe that contains the data of dumdata in the cases where the value of vkey falls between the variables vlo and vhi. In practice, the value of vkey will always fall between a vlo-vhi range, and the ranges are always discrete.

The desired output looks like this:

id   pcode   vlo   vhi  vkey
 1    1234   100   300   105
 1    1234   100   300   290
 2    9876   450   950   513
 4    2734  1325  1450  1399
 5    3999  1500  1700  1572
 5    3999  1500  1700  1683

Recommended answer

Rather than using for loops, you can construct the whole index vector in one go with sapply.

ind <- sapply(vkey, function(x) which(dumdata$vlo < x & x < dumdata$vhi))
data.frame(dumdata[ind,], vkey)

    id pcode  vlo  vhi vkey
1    1  1234  100  300  105
1.1  1  1234  100  300  290
2    2  9876  450  950  513
4    4  2734 1325 1450 1399
5    5  3999 1500 1700 1572
5.1  5  3999 1500 1700 1683
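The sapply call above only returns a plain index vector when every key matches exactly one row. As a minimal sketch, assuming you want that assumption checked rather than taken on trust, vapply can be swapped in for sapply: it raises an error if any key matches zero rows or more than one. The name `result` and the row-name reset are illustrative additions, not part of the original answer.

```r
# Dummy data from the question
dumdata <- data.frame(id = 1:5,
                      pcode = c(1234, 9876, 4477, 2734, 3999),
                      vlo = c(100, 450, 1000, 1325, 1500),
                      vhi = c(300, 950, 1100, 1450, 1700))
vkey <- c(105, 290, 513, 1399, 1572, 1683)

# vapply insists on exactly one integer per key and errors otherwise
ind <- vapply(vkey,
              function(x) which(dumdata$vlo < x & x < dumdata$vhi),
              integer(1))

# Repeat the matched rows and attach the key as a named column
result <- data.frame(dumdata[ind, ], vkey = vkey)
rownames(result) <- NULL  # drop the 1, 1.1, 5.1 style row names
print(result)
```

Note that the comparison is strict on both sides, so a key exactly equal to vlo or vhi matches nothing; use <= if the bounds should be inclusive.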

If any value in vkey matches multiple lines in dumdata it gets uglier though, as you'll need to use lapply instead of sapply and then do

data.frame(dumdata[unlist(ind),], vkey = rep(vkey, sapply(ind, length)))

to return all matches, but I take it from the example that it is not going to happen.
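To make the multiple-match case concrete, here is a small sketch with made-up overlapping ranges. The `ranges` and `keys` objects are hypothetical and not part of the question; the point is that each key has to be repeated once per matching row, and the counts come from the index list, not from the keys themselves.

```r
# Hypothetical overlapping ranges (rows 1 and 2 overlap between 200 and 300)
ranges <- data.frame(id = 1:3,
                     vlo = c(100, 200, 500),
                     vhi = c(300, 400, 600))
keys <- c(150, 250, 550, 700)

# lapply always returns a list, one integer vector of row indices per key
ind <- lapply(keys, function(x) which(ranges$vlo < x & x < ranges$vhi))

# lengths(ind) counts the matches per key; keys with no match drop out
matches <- data.frame(ranges[unlist(ind), ],
                      vkey = rep(keys, lengths(ind)))
rownames(matches) <- NULL
print(matches)
```

Here 250 appears twice because it falls in two ranges, and 700 does not appear because it falls in none.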

For completeness I'll add that you can use mapply too, but this is mainly intended for the case when you need to make comparisons with more than one variable (for example, if you had vkey1 and vkey2 that need to fulfill a condition together).

ind <- mapply(function(x, y) which(dumdata$vlo < x & y < dumdata$vhi),
              vkey1, vkey2)
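Since vkey1 and vkey2 are not defined anywhere in the question, the following is only a sketch with invented values, chosen so that each pair falls inside exactly one vlo-vhi range. It shows how the mapply index can be used in the same way as the sapply one.

```r
# Dummy data from the question
dumdata <- data.frame(id = 1:5,
                      pcode = c(1234, 9876, 4477, 2734, 3999),
                      vlo = c(100, 450, 1000, 1325, 1500),
                      vhi = c(300, 950, 1100, 1450, 1700))

# Hypothetical paired keys: each (vkey1, vkey2) pair must sit inside one range
vkey1 <- c(105, 513, 1572)
vkey2 <- c(290, 600, 1683)

# mapply walks both vectors in parallel, passing one element of each per call
ind <- mapply(function(x, y) which(dumdata$vlo < x & y < dumdata$vhi),
              vkey1, vkey2)

paired <- data.frame(dumdata[ind, ], vkey1, vkey2)
print(paired)
```

As with sapply, the result simplifies to a vector only when every pair matches exactly one row; otherwise mapply returns a list and the unlist and rep approach is needed.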
