How to merge two data.table by different column names?

Views: 513 · Published: 2017/3/12 10:39:46 · Tags: r, merge, data.table


Question

I have two data.tables, X and Y.

Columns in X: area, id, value
Columns in Y: ID, price, sales

Create the two data.tables:

library(data.table)

X = data.table(area=c('US', 'UK', 'EU'),
               id=c('c001', 'c002', 'c003'),
               value=c(100, 200, 300))

Y = data.table(ID=c('c001', 'c002', 'c003'),
               price=c(500, 200, 400),
               sales=c(20, 30, 15))

I set keys on X and Y:

setkey(X, id)
setkey(Y, ID)

Now I try to join X and Y, matching id in X against ID in Y:

merge(X, Y)
merge(X, Y, by=c('id', 'ID'))
merge(X, Y, by.x='id', by.y='ID')

All three calls raise an error saying that the column names in the by argument are invalid.

I consulted the data.table manual and found that its merge function does not support the by.x and by.y arguments.

Append:
I managed to join the two tables with X[Y], but why does the merge function fail in data.table?

Recommended answer

Use this operation:

X[Y]
#    area   id value price sales
# 1:   US c001   100   500    20
# 2:   UK c002   200   200    30
# 3:   EU c003   300   400    15

Or this operation:

Y[X]
#      ID price sales area value
# 1: c001   500    20   US   100
# 2: c002   200    30   UK   200
# 3: c003   400    15   EU   300
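If setting keys is not desirable, the same join can be sketched with the on argument, which names the matching columns directly. This is a minimal sketch that assumes a data.table version supporting on= (to my knowledge, 1.9.6 or later); the tables are rebuilt so the snippet runs on its own, and the name joined is only illustrative.

```r
library(data.table)

X <- data.table(area = c("US", "UK", "EU"),
                id = c("c001", "c002", "c003"),
                value = c(100, 200, 300))
Y <- data.table(ID = c("c001", "c002", "c003"),
                price = c(500, 200, 400),
                sales = c(20, 30, 15))

# Join without keys: for each row of Y, look up the row of X whose id equals Y's ID.
# In on = c(id = "ID"), the name is the column in X and the value is the column in Y.
joined <- X[Y, on = c(id = "ID")]
print(joined)
```

Because no key is set, neither table is reordered by reference, which can matter when the row order of X or Y is relied on elsewhere.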

Edit: after you edited your question, I read Section 1.12 of the FAQ, "What is the difference between X[Y] and merge(X,Y)?", which led me to check ?merge. I discovered that there are two different merge functions, depending on which package you are using. The default is merge.data.frame, but data.table uses merge.data.table. Compare:

merge(X, Y, by.x = "id", by.y = "ID")  # which is merge.data.table
# Error in merge.data.table(X, Y, by.x = "id", by.y = "ID") :
#   A non-empty vector of column names for `by` is required.

merge.data.frame(X, Y, by.x = "id", by.y = "ID")
#     id area value price sales
# 1 c001   US   100   500    20
# 2 c002   UK   200   200    30
# 3 c003   EU   300   400    15
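For a data.table version whose merge method accepts only by, one possible workaround is to give both tables the same column name before merging. The sketch below renames ID on a copy of Y so the original table is untouched; Y2 and merged are hypothetical names introduced for illustration.

```r
library(data.table)

X <- data.table(area = c("US", "UK", "EU"),
                id = c("c001", "c002", "c003"),
                value = c(100, 200, 300))
Y <- data.table(ID = c("c001", "c002", "c003"),
                price = c(500, 200, 400),
                sales = c(20, 30, 15))

# setnames() modifies a data.table by reference, so rename on a copy
# to keep the original Y (and its ID column) intact.
Y2 <- copy(Y)
setnames(Y2, "ID", "id")

# Both tables now share the column name "id", so a plain `by` is enough.
merged <- merge(X, Y2, by = "id")
print(merged)
```

Unlike merge.data.frame, this keeps the result a data.table, at the cost of one extra copy of Y.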

Edit for completeness: based on a comment by @Michael Bernsteiner, it looks like the data.table team is planning to implement by.x and by.y in the merge.data.table function, but had not done so when this answer was written.
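As far as I know, by.x and by.y were later added to merge.data.table (I believe in data.table 1.9.6), so on a recent installation the call from the question may simply work. The sketch below assumes such a version; on an older one it would still raise the error shown above.

```r
library(data.table)

X <- data.table(area = c("US", "UK", "EU"),
                id = c("c001", "c002", "c003"),
                value = c(100, 200, 300))
Y <- data.table(ID = c("c001", "c002", "c003"),
                price = c(500, 200, 400),
                sales = c(20, 30, 15))

# Assumes a data.table release where merge.data.table accepts by.x / by.y.
# merge() dispatches to merge.data.table because X is a data.table.
merged <- merge(X, Y, by.x = "id", by.y = "ID")
print(merged)
```

The join column keeps the name given in by.x, so the result has an id column and no ID column.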
