Joining data.table with by argument
Problem description
I have two data.tables, dx and dy:
library(data.table)
dx <- data.table(a = c(1,1,1,1,2,2), b = 3:8)
dy <- data.table(a = c(1,1,2), c = 7:9)
I want to join dy to each row of dx. The desired output is:
data.table(plyr::ddply(dx, c("a", "b"), function(d) merge(d, dy, by = "a")))
    a b c
 1: 1 3 7
 2: 1 3 8
 3: 1 4 7
 4: 1 4 8
 5: 1 5 7
 6: 1 5 8
 7: 1 6 7
 8: 1 6 8
 9: 2 7 9
10: 2 8 9
However, I could not produce this output using only merge or operations inside [] of data.table. I have tried:
merge(dx, dy, by = "a", all = TRUE)
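Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
  Join results in 10 rows; more than 9 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice.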
dy[dx,on="a"]
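Error in vecseq(f__, len__, if (allow.cartesian || notjoin || !anyDuplicated(f__, :
  Join results in 10 rows; more than 9 = nrow(x)+nrow(i). Check for duplicate key values in i each of which join to the same group in x over and over again. If that's ok, try by=.EACHI to run j for each group to avoid the large allocation. If you are sure you wish to proceed, rerun with allow.cartesian=TRUE. Otherwise, please search for this error message in the FAQ, Wiki, Stack Overflow and datatable-help for advice.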
dx[, merge(dy, by = "a"), by = c("a", "b")]
Error in is.data.table(y) : argument "y" is missing, with no default
dx[, merge(.SD, dy, by = "a"), by = c("a", "b")]
Error in merge.data.table(.SD, dy, by = "a") :
  Elements listed in `by` must be valid column names in x and y
How can I do this?
Thanks!
Recommended answer
If I understood your requirement correctly, there is a direct merge option that you can use:
dx <- data.table(a = c(1,1,2,2), b = 3:6)
dy <- data.table(a = c(1,1,2), c = 7:9)
merge(x = dx, y = dy, by = "a", all = TRUE)
It gives the desired output you mentioned. See also: How to join (merge) data frames (inner, outer, left, right)?
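One caveat: the dx above has four rows, so the merge returns 6 rows, which is within the nrow(x)+nrow(i) = 7 limit. With the six-row dx from the question, the same call hits the cartesian-join check quoted in the error message, and that message itself suggests allow.cartesian = TRUE. A minimal sketch, assuming the data from the question (the names res_merge and res_join are only illustrative):

```r
library(data.table)

# Data as given in the question
dx <- data.table(a = c(1, 1, 1, 1, 2, 2), b = 3:8)
dy <- data.table(a = c(1, 1, 2), c = 7:9)

# The join yields 10 rows, more than nrow(dx) + nrow(dy) = 9,
# so it has to be allowed explicitly.
res_merge <- merge(dx, dy, by = "a", allow.cartesian = TRUE)

# The same join written with [] syntax: for each row of dx, take the
# matching rows of dy. Columns come back as a, c, b, so reorder them.
res_join <- dy[dx, on = "a", allow.cartesian = TRUE]
setcolorder(res_join, c("a", "b", "c"))

print(res_merge)
print(res_join)
```

Both results should have the ten rows listed in the question. The merge result is additionally keyed on a, which the [] result is not.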
I hope this clears up your doubt; if not, I am sorry.
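As for the by-group attempt in the question, dx[, merge(.SD, dy, by = "a"), by = c("a", "b")] fails because .SD holds the group's data without the columns named in by, so there is no column a left in .SD to merge on. The grouping columns are still available in j as length-1 values, though. A possible sketch along those lines, again assuming the question's data (res_by is an illustrative name, not something from the original post):

```r
library(data.table)

# Data as given in the question
dx <- data.table(a = c(1, 1, 1, 1, 2, 2), b = 3:8)
dy <- data.table(a = c(1, 1, 2), c = 7:9)

# For each (a, b) group, a is a single value inside j; use it to pick
# the values of c from the rows of dy that share that a.
res_by <- dx[, .(c = dy$c[dy$a == a]), by = .(a, b)]

print(res_by)
```

This runs j once per row of dx, so on large tables a plain join is likely the faster choice.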