R data.table join:SQL“select *”类似的语法在连接表? [英] R data.table join: SQL "select *" alike syntax in joined tables?
问题描述
我有两个data.tables有许多字段。
我想连接两个表,添加一些计算字段,并追加所有其他字段从第一,第二或者两个表(类似于SQL select a + b AS sum,DT1。*,DT2。* FROM ...
简化的示例数据:
library(data.table)
DT1 = data.table(x = c(c,a,b a,b),a = 1:5)
DT2 = data.table(x = c(d,c,b),b = 6:8)
现在我想加入表格并添加一个计算字段:
DT1 [DT2,。(sum = a + b,<<如何说DT1。*,DT2。* here?> ),on =x]
受用户jangorecki的启发我找到了一个功能请求:
它使用非常强大的R功能,称为计算(或元编程),在官方 R语言定义手册。这是R语言的一个特殊功能,不应忘记IMO。
library(data.table)
DT1 = data.table(x = c(c,a,b,a,b),a = 1:5 )
DT2 = data.table(x = c(d,c,b),b = 6:8)
jj = as.call
list(as.name(。)),
list(sum = quote(a + b)),
lapply ))),as.name)
))
print(jj)
#(sum = a + b,x,a,b)
DT1 (jj),on =x]
#sum xab
#1:NA d NA 6
#2:8 c 1 7
#3:11 b 3 8
#4:13 b 5 8
I have two data.tables with many fields.
I want to join the two tables, add some calculated fields and append all other fields from the first, second or both tables (similar to SQL's select a+b AS sum, DT1.*, DT2.* FROM...
) without typing all the field names.
How can I do this (regarding easiest syntax and best performance)?
Simplified example data:
library(data.table)
DT1 = data.table(x=c("c", "a", "b", "a", "b"), a=1:5)
DT2 = data.table(x=c("d", "c", "b"), b=6:8)
Now I want to join the tables and add a calculated field:
DT1[DT2, .(sum=a + b, <<< how to say DT1.*, DT2.* here? >>> ), on="x"]
Update May 4, 2016: Inspired by user jangorecki I have found a feature request for this:
Should be able to refer to i's .SD during a join
This should precisely answer your need.
It uses very powerful R feature called computing on the language (or meta programming) well described in official R Language Definition manual. This is an exceptional feature of R language and should not be forgotten IMO.
library(data.table)
DT1 = data.table(x=c("c", "a", "b", "a", "b"), a=1:5)
DT2 = data.table(x=c("d", "c", "b"), b=6:8)
jj = as.call(c(
list(as.name(".")),
list(sum = quote(a+b)),
lapply(unique(c(names(DT1), names(DT2))), as.name)
))
print(jj)
#.(sum = a + b, x, a, b)
DT1[DT2, eval(jj), on="x"]
# sum x a b
#1: NA d NA 6
#2: 8 c 1 7
#3: 11 b 3 8
#4: 13 b 5 8
这篇关于R data.table join:SQL“select *”类似的语法在连接表?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!