Merging data.tables based on column names


Question

I am trying to do some left-join merges with data.tables. The package documentation states that:

In all joins the names of the columns are irrelevant; the columns of x's key are joined to in order

I understand that I can use `[.data.table` and data.table:::merge.data.table.

What I would like is to merge X and Y while specifying the keys, like by.x and by.y in base merge. (Why was this taken away?)

Suppose I have:

library(data.table)
DT = data.table(x=rep(c("a","b","c"),each=3),y=c(1,3,6),v=1:9,key="x,y,v")
DT1 = data.table(x1=c("aa","bb","cc"),y1=c(1,3,6),v1=1:3,key="x1,y1,v1")

I would like this output:

# data.table's merge method masks the base one; I don't know how to call the base version of merge anymore
> base::merge(DT, DT1, by.x="y", by.y="y1")
  y x v x1 v1
1 1 a 1 aa  1
2 1 c 7 aa  1
3 1 b 4 aa  1
4 3 a 2 bb  2
5 3 b 5 bb  2
6 3 c 8 bb  2
7 6 b 6 cc  3
8 6 a 3 cc  3
9 6 c 9 cc  3

I am very happy to use `[` or data.table's merge, but I would like an option that does not modify DT or DT1 (that is, without renaming the columns, calling merge, and then renaming them back).

Recommended answer

Update: Since data.table v1.9.6 (released September 19, 2015), merge.data.table() accepts and correctly handles the by.x= and by.y= arguments. The feature request (FR) referenced below is now closed.
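As a minimal sketch of what the update describes, assuming data.table v1.9.6 or later is installed, the tables from the question can be merged on differently named columns without renaming anything. The tables are rebuilt here without keys, and `res` is just an illustrative name.

```r
library(data.table)

# Same data as in the question, built without keys to keep the example small
DT  <- data.table(x = rep(c("a", "b", "c"), each = 3),
                  y = rep(c(1, 3, 6), times = 3),
                  v = 1:9)
DT1 <- data.table(x1 = c("aa", "bb", "cc"), y1 = c(1, 3, 6), v1 = 1:3)

# Left join: keep every row of DT, matching DT$y against DT1$y1
res <- merge(DT, DT1, by.x = "y", by.y = "y1", all.x = TRUE)
print(res)

# The inputs keep their original column names
print(names(DT))
print(names(DT1))
```

The join column in the result takes its name from by.x (here `y`), which matches the behaviour of base merge.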

Yes, this is a feature request that is not yet implemented:

FR#2033 Add by.x and by.y to merge.data.table

There isn't anything preventing it; it just hasn't been done. I very rarely need merge and was slow to realise its usefulness more generally. We've made good progress in making merge as fast as X[Y], and this feature request is at the highest priority. If you'd like it sooner, you are more than welcome to add those arguments to merge.data.table and commit the change yourself. We try to keep source code short and together in one function/file, so hopefully you can follow the merge.data.table source and see what needs to be done.
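The X[Y] form mentioned above can also join on differently named columns without setting keys or renaming anything. This is only a sketch, and it assumes data.table v1.9.6 or later, where the `on=` argument of `[.data.table` is available. The name `joined` is illustrative.

```r
library(data.table)

DT  <- data.table(x = rep(c("a", "b", "c"), each = 3),
                  y = rep(c(1, 3, 6), times = 3),
                  v = 1:9)
DT1 <- data.table(x1 = c("aa", "bb", "cc"), y1 = c(1, 3, 6), v1 = 1:3)

# X[Y] join: for each row of DT1, look up the rows of DT where DT$y == DT1$y1.
# on = c(y = "y1") maps the column of X (left name) to the column of Y (right value).
joined <- DT[DT1, on = c(y = "y1")]
print(joined)
```

Note that X[Y] keeps every row of Y (here DT1), so it is a left join from DT1's point of view. With this data every value matches, so the result has the same rows as the merge-based approach.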
