Set operations within a list column


Problem description

I am trying to perform set operations between the vectors stored in a list column, like this:

library(data.table)

DT  <- data.table(exp = c("exp1", "exp2", "exp2"), 
                  sample = c(1L, 1L, 2L), 
                  listdata = list(c(2L,5L), c(2L,3L,5L,7L), c(1L,2L,6L)))

> DT
    exp sample listdata
1: exp1      1      2,5
2: exp2      1  2,3,5,7
3: exp2      2    1,2,6

While it is very cumbersome, I can do

DT$inc = list(setdiff(unlist(DT$listdata[2]), unlist(DT$listdata[1])))

and obtain a new list column with the value c(3,7). But if I try to calculate the difference between the current row and the first row using

DT$inc = list(list(setdiff(unlist(DT$listdata, recursive = FALSE), unlist(DT$listdata[1]))))

expecting a new column "inc"

0
c(3,7)
c(1,6)

I get c(3,7,1,6). Apparently unlist flattened the whole list column together. Any idea what's going on?

I am also learning dplyr and data.table. So it would really help if you can provide solutions using one of them.

Solution

[...] I try to calculate the difference between the current row and the first row

Well, you can do...

DT[, inc := .(Map(setdiff, listdata, listdata[1L]))]

#     exp sample listdata inc
# 1: exp1      1      2,5    
# 2: exp2      1  2,3,5,7 3,7
# 3: exp2      2    1,2,6 1,6
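
As for why the attempt in the question returned c(3,7,1,6): unlist() concatenates every element of the list column into a single vector, so setdiff() only ever sees one long vector and the row boundaries are gone. Map() instead calls setdiff() once per element. A minimal base R sketch of the difference, using a stand-alone copy of the list column (the names flat, flat_diff and inc are only illustrative):

```r
# Stand-alone copy of the list column from the question
listdata <- list(c(2L, 5L), c(2L, 3L, 5L, 7L), c(1L, 2L, 6L))

# unlist() merges all rows into one vector, losing the per-row structure
flat <- unlist(listdata)
flat_diff <- setdiff(flat, listdata[[1L]])   # one vector for the whole column

# Map() applies setdiff() element by element; the length-1 list
# listdata[1L] is recycled, so every row is compared with row 1
inc <- Map(setdiff, listdata, listdata[1L])

print(flat_diff)
print(inc)
```

Here inc keeps one result per row, with an empty integer vector for the first row, which is what the inc column above displays as a blank cell.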

But I think it's far better to just not work with list columns.


Not working with list columns might look like...

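# Record each row's position so values can be traced back to their original row
DT[, r := .I]
# Unnest: repeat each row once per element of its list and put the elements in column v
DT2 = DT[, c(.SD[rep(.I, lengths(listdata))], .(v = unlist(listdata))), .SDcols = !"listdata"]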

#     exp sample r v
# 1: exp1      1 1 2
# 2: exp1      1 1 5
# 3: exp2      1 2 2
# 4: exp2      1 2 3
# 5: exp2      1 2 5
# 6: exp2      1 2 7
# 7: exp2      2 3 1
# 8: exp2      2 3 2
# 9: exp2      2 3 6

Then we just work with this data set, and can do

# Anti-join: keep only the rows whose v does not occur among the first row's values
DT2[!DT2[r == 1L], on = "v"]

#     exp sample r v
# 1: exp2      1 2 3
# 2: exp2      1 2 7
# 3: exp2      2 3 1
# 4: exp2      2 3 6
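
If a list column is still wanted at the end, the long-format result can be collapsed back to one row per original row. The sketch below is one possible way to do that and is not part of the original answer: it builds the long table with a by= grouping rather than the .SD construction, and the name res is made up for illustration.

```r
library(data.table)

DT <- data.table(exp = c("exp1", "exp2", "exp2"),
                 sample = c(1L, 1L, 2L),
                 listdata = list(c(2L, 5L), c(2L, 3L, 5L, 7L), c(1L, 2L, 6L)))
DT[, r := .I]

# Long format: one row per (original row, value) pair
DT2 <- DT[, .(v = unlist(listdata)), by = .(exp, sample, r)]

# Anti-join against the first row's values, then gather the remaining
# values of each original row back into a list column
res <- DT2[!DT2[r == 1L], on = "v"][, .(inc = list(v)), by = .(exp, sample, r)]

print(res)
```

Note that an original row with nothing left after the anti-join (row 1 here) does not appear in res at all, whereas the Map() approach keeps it with an empty vector.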
