Set operation within a list column
Problem description
I am trying to do set operations between the vectors stored in a list column, like this:

    DT <- data.table(exp = c("exp1", "exp2", "exp2"),
                     sample = c(1L, 1L, 2L),
                     listdata = list(c(2L,5L), c(2L,3L,5L,7L), c(1L,2L,6L)))
    > DT
        exp sample listdata
    1: exp1      1      2,5
    2: exp2      1  2,3,5,7
    3: exp2      2    1,2,6

While very cumbersome, I can do

    DT$inc = list(setdiff(unlist(DT$listdata[2]), unlist(DT$listdata[1])))

and obtain a new list column with the value c(3,7). But if I try to calculate the difference between the current row and the first row using

    DT$inc = list(list(setdiff(unlist(DT$listdata, recursive = FALSE), unlist(DT$listdata[1]))))

expecting a new column "inc"

    0
    c(3,7)
    c(1,6)

I get c(3,7,1,6). Apparently unlist flattened the whole list column together. Any idea what's going on?

I am also learning dplyr and data.table. So it would really help if you can provide solutions using one of them.
Solution

[...] I try to calculate the difference between the current row and the first row
Well, you can do...
    DT[, inc := .(Map(setdiff, listdata, listdata[1L]))]
    #     exp sample listdata inc
    # 1: exp1      1      2,5
    # 2: exp2      1  2,3,5,7 3,7
    # 3: exp2      2    1,2,6 1,6
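To see why the original attempt returned c(3,7,1,6) while Map gives one result per row, here is a minimal base R sketch. It uses a plain list with the same values as the listdata column; the variable names flat, whole and per_row are only illustrative.

```r
# The values of the list column from the question, as a plain list
listdata <- list(c(2L, 5L), c(2L, 3L, 5L, 7L), c(1L, 2L, 6L))

# unlist() on the whole list concatenates every element into one vector:
# 2 5 2 3 5 7 1 2 6
flat <- unlist(listdata, recursive = FALSE)

# setdiff() therefore sees one long vector and returns one result: 3 7 1 6
whole <- setdiff(flat, listdata[[1L]])

# Map() calls setdiff() once per element. listdata[1L] is a length-1 list,
# so it is recycled and every element is compared with the first one.
per_row <- Map(setdiff, listdata, listdata[1L])
```

Here per_row is a list of three vectors (empty, then 3 7, then 1 6), which is the shape a list column needs, whereas whole is a single vector.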
But I think it's far better to just not work with list columns.
Not working with list columns might look like...
    DT[, r := .I]
    DT2 = DT[, c(.SD[rep(.I, lengths(listdata))], .(v = unlist(listdata))), .SDcols=!"listdata"]
    #     exp sample r v
    # 1: exp1      1 1 2
    # 2: exp1      1 1 5
    # 3: exp2      1 2 2
    # 4: exp2      1 2 3
    # 5: exp2      1 2 5
    # 6: exp2      1 2 7
    # 7: exp2      2 3 1
    # 8: exp2      2 3 2
    # 9: exp2      2 3 6
Then we just work with this data set, and can do
    DT2[!DT2[r==1L], on="v"]
    #     exp sample r v
    # 1: exp2      1 2 3
    # 2: exp2      1 2 7
    # 3: exp2      2 3 1
    # 4: exp2      2 3 6
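If a per-row result is still wanted afterwards, the long-format anti-join can be collapsed back by row number. The sketch below is an assumption-laden simplification rather than the answer's exact code: it keeps only the row number r and the value v, and the names long, kept and by_row are made up for illustration.

```r
library(data.table)

DT <- data.table(exp = c("exp1", "exp2", "exp2"),
                 sample = c(1L, 1L, 2L),
                 listdata = list(c(2L, 5L), c(2L, 3L, 5L, 7L), c(1L, 2L, 6L)))

# Long format: one row per value, with r holding the original row number
long <- DT[, .(r = rep(.I, lengths(listdata)), v = unlist(listdata))]

# Anti-join on v: drop every value that also occurs in row 1
kept <- long[!long[r == 1L], on = "v"]

# Collapse back to one list element per original row. Rows left with no
# values (row 1 here) simply do not appear in the result.
by_row <- kept[, .(inc = list(v)), by = r]
```

This should give the same non-empty differences as the Map approach (3,7 for row 2 and 1,6 for row 3), just keyed by r instead of stored alongside the original columns.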