R:data.table 交叉连接不起作用 [英] R: data.table cross-join not working

查看:26
本文介绍了R:data.table 交叉连接不起作用的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有两个 data.table 我想加入(形成笛卡尔积).data.table 之一以 Date 向量为键,另一个以 numeric 向量为键:

I have two data.tables that I want to join (form a Cartesian product of). One of the data.tables is keyed on a Date vector, and the other on a numeric vector:

# data.table with dates (as numeric)
dtDates2 = data.table(date = 
                       as.numeric(seq(from = as.Date('2014/01/01'), 
                           to = as.Date('2014/07/01'), by = 'weeks')),
                     data1 = rnorm(26))

# data.table with dates
dtDates1 = data.table(date = 
                        seq(from = as.Date('2014/01/01'), 
                            to = as.Date('2014/07/01'), by = 'weeks'),
                      data1 = rnorm(26))


# data.table with customer IDs
dtCustomers = data.table(customerID = seq(1, 100),
                      data2 = rnorm(100))

I setkey 并尝试使用 CJ 交叉连接它们:

I setkey and try to cross-join them using CJ:

# cross join the two datatables
setkey(dtCustomers, customerID)
setkey(dtDates1, date)
setkey(dtDates2, date)

CJ(dtCustomers, dtDates1)
CJ(dtCustomers, dtDates2)

但得到以下错误:

Error in FUN(X[[1L]], ...) : 
  Invalid column: it has dimensions. Can't format it. If it's the result of data.table(table()), use as.data.table(table()) instead.

不确定我做错了什么.

推荐答案

data.table 中没有现成可用的交叉联接功能.
然而,在 optiRum 中有 CJ.dt 函数(类似于 CJ 但专为 data.tables 设计)来实现笛卡尔积(交叉连接)包(在 CRAN 中可用).
您可以创建函数:

There is no cross join functionality available in data.table out of the box.
Yet there is CJ.dt function (a CJ like but designed for data.tables) to achieve cartesian product (cross join) available in optiRum package (available in CRAN).
You can create the function:

CJ.dt = function(X,Y) {
  stopifnot(is.data.table(X),is.data.table(Y))
  k = NULL
  X = X[, c(k=1, .SD)]
  setkey(X, k)
  Y = Y[, c(k=1, .SD)]
  setkey(Y, NULL)
  X[Y, allow.cartesian=TRUE][, k := NULL][]
}
CJ.dt(dtCustomers, dtDates1)
CJ.dt(dtCustomers, dtDates2)

然而,在 data.table#1717 中填充了一个 FR 来方便地执行交叉连接,所以你可以检查那里是否有更好的交叉连接api.

Yet there is a FR for convenience way to perform cross join filled in data.table#1717, so you could check there if there is a nicer api for cross join.

这篇关于R:data.table 交叉连接不起作用的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆