Unlist nested list columns in data.table
Question
How can I unlist a nested list column in data.table? Assume all the list elements are of the same type. The list elements are named, and the names have to be handled as well.

It is more or less the opposite of data.table aggregation to a list column.

I think it is worth having in the SO data.table knowledge base.

My current workaround is below; I'm looking for a slightly more canonical answer.

    library(data.table)
    dt <- data.table(
      a = letters[1:3],
      l = list(list(c1=6L, c2=4L), list(x=2L, y=4L, z=3L), list())
    )
    dt[]
    #    a      l
    # 1: a <list>
    # 2: b <list>
    # 3: c <list>
    dt[, .(a = rep(a, length(l)),
           nm = names(unlist(l)),
           ul = unlist(l)),
       .(id = seq_along(a))
       ][, id := NULL
         ][]
    #    a nm ul
    # 1: a c1  6
    # 2: a c2  4
    # 3: b  x  2
    # 4: b  y  4
    # 5: b  z  3
    # 6: c NA NA
Answer

Not sure it is more "canonical", but here is a way to modify l so you can use by = a, provided you know the type of the data in the list (with some improvements, thanks to @DavidArenburg):

    dt[lengths(l) == 0, l := NA_integer_][, .(nm = names(unlist(l)), ul = unlist(l)), by = a]
    #    a nm ul
    # 1: a c1  6
    # 2: a c2  4
    # 3: b  x  2
    # 4: b  y  4
    # 5: b  z  3
    # 6: c NA NA
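
Note that := in the answer above updates dt by reference, so the empty element of l is overwritten in the original table. If that side effect is unwanted, the same idea can be sketched with a small helper that substitutes the NA row inside j instead. This is only an illustration, not part of the original answer: flatten_group is a hypothetical name, and the sketch assumes integer values and that every non-empty element is fully named.

```r
library(data.table)

dt <- data.table(
  a = letters[1:3],
  l = list(list(c1 = 6L, c2 = 4L), list(x = 2L, y = 4L, z = 3L), list())
)

# Hypothetical helper (not from the original answer): flatten the list
# elements of one group into name/value columns. An empty group yields a
# single NA row so the group still appears in the result.
# Assumes integer values and named elements.
flatten_group <- function(x) {
  v <- unlist(x)
  if (length(v) == 0L) {
    return(list(nm = NA_character_, ul = NA_integer_))
  }
  list(nm = names(v), ul = unname(v))
}

# j returns a list of equal-length columns for each group of a;
# dt itself is left unchanged because no := is involved.
res <- dt[, flatten_group(l), by = a]
res[]
```

The column types are fixed explicitly (NA_character_, NA_integer_) so that every group returns the same types; for other value types the NA constant would need to be changed accordingly.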