data.table aggregation to list column


Problem description

I'm trying to aggregate data from a data.table to create a new column which is a list of the previous rows. It's easier to see by example:

dt <- data.table(id = c(1,1,1,1,2,2,3,3,3), letter = c('a','a','b','c','a','c','b','b','a'))

I would like to aggregate this in such a way that the result is

   id  letter
1:  1 a,a,b,c
2:  2     a,c
3:  3   b,b,a  

Intuitively I tried

dt[,j = list(list(letter)), by = id]

but that doesn't work. Oddly enough when I go case by case, for example:

> dt[id == 1,j = list(list(letter)), by = id]

   id      V1
1:  1 a,a,b,c

the result is fine... I feel like I'm missing an .SD somewhere or something like that...

Can anybody point me in the right direction?

Thanks!

Solution

Update: The behaviour DT[, list(list(.)), by=.] sometimes resulted in wrong results in R version >= 3.1.0. This is now fixed in commit #1280 in the current development version of data.table v1.9.3. From NEWS:

  • DT[, list(list(.)), by=.] returns correct results in R >=3.1.0 as well. The bug was due to recent (welcoming) changes in R v3.1.0 where list(.) does not result in a copy. Closes #481.

With this update, I() is no longer necessary. You can just do: DT[, list(list(.)), by=.] as before.
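As a minimal sketch of the fixed behaviour, assuming a data.table release that includes this fix, the grouped call yields one row per id with a list column. The column name `letter_list` is only an illustrative choice, not something from the original post.

```r
library(data.table)

dt <- data.table(id = c(1,1,1,1,2,2,3,3,3),
                 letter = c('a','a','b','c','a','c','b','b','a'))

# Wrap the grouped vector in list() so each group contributes one list cell.
# `letter_list` is a hypothetical column name chosen for this example.
res <- dt[, list(letter_list = list(letter)), by = id]

# Each cell of the list column holds the character vector for that id.
print(res)
str(res$letter_list)
```

Individual groups can then be pulled out with `[[`, for example `res$letter_list[[2]]` for the second id.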


This seems to be similar to the known bug #5585. In your case, I think you could just use

dt[, paste(letter, collapse=","), by = id] 

to fix your problem.
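Note that the paste() approach produces an ordinary character column, not a list column. The sketch below illustrates the difference; the name `collapsed` and the strsplit() step are additions for illustration, not part of the original answer.

```r
library(data.table)

dt <- data.table(id = c(1,1,1,1,2,2,3,3,3),
                 letter = c('a','a','b','c','a','c','b','b','a'))

# Collapse each group's letters into a single comma-separated string.
collapsed <- dt[, list(letter = paste(letter, collapse = ",")), by = id]

# The result is a plain character vector with one string per id.
print(collapsed)
is.character(collapsed$letter)

# If the individual values are needed again, split the strings back apart.
parts <- strsplit(collapsed$letter, ",", fixed = TRUE)
```

This is enough when the goal is only the display shown in the question; when later code needs the individual elements, a real list column avoids the round trip through strings.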

As @ilir pointed out, if it is actually desirable to get a list (rather than the displayed character), you could use the workaround suggested in the bug report:

dt[, list(list(I(letter))), by = id]
