data.table aggregation to list column
Problem description
I'm trying to aggregate data from a data.table to create a new column that is a list of the values from the original rows. It's easier to see by example:
dt <- data.table(id = c(1,1,1,1,2,2,3,3,3), letter = c('a','a','b','c','a','c','b','b','a'))
I would like to aggregate this in such a way that the result is:
   id  letter
1:  1 a,a,b,c
2:  2     a,c
3:  3   b,b,a
Intuitively I tried
dt[,j = list(list(letter)), by = id]
but that doesn't work. Oddly enough when I go case by case, for example:
> dt[id == 1, j = list(list(letter)), by = id]
   id      V1
1:  1 a,a,b,c
the result is fine... I feel like I'm missing a .SD somewhere, or something like that. Can anybody point me in the right direction?
Thanks!
Solution
Update: The behaviour where DT[, list(list(.)), by=.] sometimes gave wrong results in R version >= 3.1.0 is now fixed in commit #1280 in the current development version of data.table, v1.9.3. From NEWS:

DT[, list(list(.)), by=.] returns correct results in R >=3.1.0 as well. The bug was due to recent (welcoming) changes in R v3.1.0 where list(.) does not result in a copy. Closes #481.

With this update, I() is no longer necessary. You can just do DT[, list(list(.)), by=.] as before.
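A minimal sketch of the list-column aggregation on the question's data, assuming a data.table release that already includes the fix described above. The names `res` and `letters_by_id` are illustrative and do not come from the original post.

```r
library(data.table)

# Example data from the question
dt <- data.table(id = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
                 letter = c('a', 'a', 'b', 'c', 'a', 'c', 'b', 'b', 'a'))

# The inner list() wraps each group's vector so that it becomes a single
# cell of a list column; the outer list() builds the columns returned by j.
res <- dt[, list(letters_by_id = list(letter)), by = id]

# Each cell holds the full character vector for its group
res$letters_by_id[[1]]        # letters for id 1
lengths(res$letters_by_id)    # number of letters per id
```

Naming the column inside j avoids the default name V1 seen in the question's output.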
This seems to be a similar issue to the known bug #5585. In your case, I think you could just use
dt[, paste(letter, collapse=","), by = id]
to fix your problem.
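As a rough sketch of what the paste() approach produces: the result is an ordinary character column with one comma-separated string per id, not a list column. The names `res_chr` and `letter_vectors` are illustrative, and splitting the strings back with strsplit() is only an assumption about how one might recover the vectors later.

```r
library(data.table)

# Example data from the question
dt <- data.table(id = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
                 letter = c('a', 'a', 'b', 'c', 'a', 'c', 'b', 'b', 'a'))

# Collapse each group's letters into one string
res_chr <- dt[, list(letter = paste(letter, collapse = ",")), by = id]

# The column is plain character, so it prints like the desired output
is.character(res_chr$letter)

# If the individual values are needed again, split on the separator
letter_vectors <- strsplit(res_chr$letter, ",", fixed = TRUE)
```

This only round-trips cleanly when the values themselves never contain the separator.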
As @ilir pointed out, if what you actually want is a list (rather than a single character string per group), you could use the workaround suggested in the bug report:
dt[, list(list(I(letter))), by = id]
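A small sketch of working with the result of the I() workaround. I() tags each vector with the class "AsIs", so this example strips the tag again with as.character(); that cleanup step is an addition for illustration and is not part of the bug report's suggestion. The names `res_asis` and `plain` are hypothetical.

```r
library(data.table)

# Example data from the question
dt <- data.table(id = c(1, 1, 1, 1, 2, 2, 3, 3, 3),
                 letter = c('a', 'a', 'b', 'c', 'a', 'c', 'b', 'b', 'a'))

# Workaround from the answer; the unnamed j expression yields column V1
res_asis <- dt[, list(list(I(letter))), by = id]

# as.character() returns plain character vectors without the "AsIs" class
plain <- lapply(res_asis$V1, as.character)
plain[[1]]   # letters for id 1
```

On data.table versions that include the fix, the I() call can simply be dropped.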