How to 'unlist' a column in a data.table
Problem description
In my table, some cells are vectors instead of single values, i.e. the column is a list instead of a vector:
library(data.table)

dt1 <- data.table(
colA= c('A1','A2','A3'),
colB=list('B1',c('B2a','B2b'),'B3'),
colC= c('C1','C2','C3'),
colD= c('D1','D2','D3')
)
dt1
# colA colB colC colD
#1: A1 B1 C1 D1
#2: A2 B2a,B2b C2 D2
#3: A3 B3 C3 D3
I need to reshape it to long format by unlisting the column colB. So far I do it like this:
dt1[,.(colB=unlist(colB)),by=.(colA,colC,colD)]
# colA colC colD colB
#1: A1 C1 D1 B1
#2: A2 C2 D2 B2a
#3: A2 C2 D2 B2b
#4: A3 C3 D3 B3
It does the job, but I don't like that I have to list all the other column names explicitly in by=. Is there a better way to do this?
(I'm sure it's already answered elsewhere but I couldn't find it so far)
P.S. Ideally I would like to manage without any additional packages.
Promoting my comment to an answer. Using:
dt1[,.(colB = unlist(colB)), by = setdiff(names(dt1), 'colB')]
gives:
#    colA colC colD colB
# 1:   A1   C1   D1   B1
# 2:   A2   C2   D2  B2a
# 3:   A2   C2   D2  B2b
# 4:   A3   C3   D3   B3
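To see why this works: setdiff() returns every column name except colB, and by= accepts a character vector of column names. A minimal sketch, where by_cols, res_auto and res_manual are names made up for illustration:

```r
library(data.table)

dt1 <- data.table(
  colA = c('A1', 'A2', 'A3'),
  colB = list('B1', c('B2a', 'B2b'), 'B3'),
  colC = c('C1', 'C2', 'C3'),
  colD = c('D1', 'D2', 'D3')
)

# Every column name except the list column being expanded
by_cols <- setdiff(names(dt1), 'colB')
by_cols
# "colA" "colC" "colD"

# Grouping by the character vector should match spelling the columns out by hand
res_auto   <- dt1[, .(colB = unlist(colB)), by = by_cols]
res_manual <- dt1[, .(colB = unlist(colB)), by = .(colA, colC, colD)]
all.equal(res_auto, res_manual)
```

Note that, as in the question's version, the unlisted column ends up last in the result.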
Or as an alternative (a slight variation of @Frank's proposal):
dt1[rep(dt1[,.I], lengths(colB))][, colB := unlist(dt1$colB)][]
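This second variant does no grouping: it repeats each row once per element of its colB cell, then overwrites colB with the flattened values. If you need it for more than one table, it could be wrapped in a small helper. The sketch below is only an illustration; unlist_col is a hypothetical name, not a data.table function, and it assumes each cell of the list column is a plain atomic vector (empty cells drop their row).

```r
library(data.table)

# Hypothetical helper (not part of data.table): expand one list column to long format
unlist_col <- function(dt, col) {
  # Repeat each row index once per element in that row's cell
  idx  <- rep(seq_len(nrow(dt)), lengths(dt[[col]]))
  # Flatten all cells into one vector, in row order
  vals <- unlist(dt[[col]])
  out  <- dt[idx]                   # new table with repeated rows; dt itself is untouched
  set(out, j = col, value = vals)   # replace the list column with the flat vector
  out[]
}

dt1 <- data.table(
  colA = c('A1', 'A2', 'A3'),
  colB = list('B1', c('B2a', 'B2b'), 'B3'),
  colC = c('C1', 'C2', 'C3'),
  colD = c('D1', 'D2', 'D3')
)

long <- unlist_col(dt1, 'colB')
long
```

Unlike the by= version, this keeps the original column order, and since nothing is grouped it does not depend on the other columns being usable as grouping keys.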