How to do a basic left outer join with data.table in R?

Published: 2017/3/12 11:58:36 | Tags: sql, r, data.table

Question

I have a data.table with columns a and b that I've partitioned into below (rows with b < .5) and above (rows with b > .5):

library(data.table)
DT = data.table(a=as.integer(c(1,1,2,2,3,3)), b=c(0,0,0,1,1,1))
above = DT[b > .5]
below = DT[b < .5, list(a=a)]

I'd like to do a left outer join between above and below: for each a in above, count the number of matching rows in below. This is equivalent to the following SQL:

with dt as (select 1 as a, 0 as b union select 1, 0 union select 2, 0 union select 2, 1 union select 3, 1 union select 3, 1),
  above as (select a, b from dt where b > .5),
  below as (select a, b from dt where b < .5)
select above.a, count(below.a) from above left outer join below on (above.a = below.a) group by above.a;
 a | count 
---+-------
 3 |     0
 2 |     1
(2 rows)

How do I accomplish the same thing with data.table? This is what I have tried so far:

> key(below) = 'a'
> below[above, list(count=length(b))]
     a count
[1,] 2     1
[2,] 3     1
[3,] 3     1
> below[above, list(count=length(b)), by=a]
Error in eval(expr, envir, enclos) : object 'b' not found
> below[, list(count=length(a)), by=a][above]
     a count b
[1,] 2     1 1
[2,] 3    NA 1
[3,] 3    NA 1

I should also mention that I already tried merge, but it exhausts the memory on my system (even though the dataset takes up only about 20% of my memory).

Recommended answer

See if this gives you something useful. Your example is too sparse for me to tell exactly what you want, but it looks like it might be a tabulation of the values of above$a that are also in below$a:
table(above$a[above$a %in% below$a])

If you also want the converse, that is, the values not in below, then this would do it:
table(above$a[!above$a %in% below$a])

And you can concatenate them:
> c(table(above$a[above$a %in% below$a]),table(above$a[!above$a %in% below$a]) )
2 3 
1 2
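
Note that the concatenated table above counts rows of above, whereas the SQL in the question counts matching rows of below. If the SQL counts are what is wanted, one possible base R variant (a minimal sketch, not part of the original answer; the name counts is introduced here for illustration) is to tabulate below$a over the keys found in above:

```r
library(data.table)

# Rebuild the example data from the question.
DT <- data.table(a = as.integer(c(1, 1, 2, 2, 3, 3)), b = c(0, 0, 0, 1, 1, 1))
above <- DT[b > .5]
below <- DT[b < .5, list(a = a)]

# Restrict the factor levels to the keys present in `above`.
# Values of below$a outside those levels become NA, which table() drops
# by default, while keys with no match in `below` still appear with count 0.
counts <- table(factor(below$a, levels = unique(above$a)))
print(counts)
```

For the example data this should report one matching row for a = 2 and none for a = 3, in line with the SQL result.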

Generally, table and %in% have a reasonably small memory footprint and are quick.
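
For readers who want to stay within data.table, the left outer join with counts from the question could be sketched roughly as follows. This is only one possible approach, not the answerer's method, and it assumes a data.table version recent enough to support the on= argument; the names counts, keys and result are introduced here for illustration.

```r
library(data.table)

# Rebuild the example data from the question.
DT <- data.table(a = as.integer(c(1, 1, 2, 2, 3, 3)), b = c(0, 0, 0, 1, 1, 1))
above <- DT[b > .5]
below <- DT[b < .5, list(a = a)]

# Step 1: aggregate `below` first, so the join only touches one row per key.
counts <- below[, .(count = .N), by = a]

# Step 2: the distinct keys of `above` play the role of the left-hand table.
keys <- unique(above[, .(a)])

# Step 3: X[Y, on = ...] returns one row per row of Y, with NA where X has
# no match, which is the left outer join of `keys` with `counts`.
result <- counts[keys, on = "a"]

# Step 4: keys without a match in `below` get a count of zero, as in SQL.
result[is.na(count), count := 0L]
print(result)
```

Aggregating before joining keeps the intermediate result small, which may help with the memory problem described for merge, although that has not been measured here.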
