Union of intersecting vectors in a list in R


Problem description

I have a list of vectors as follows.

data <- list(v1=c("a", "b", "c"), v2=c("g", "h", "k"), 
             v3=c("c", "d"), v4=c("n", "a"), v5=c("h", "i"))

I am trying to achieve the following

1) Check whether any of the vectors intersect with each other.

2) If intersecting vectors are found, get their union.

So the desired output is

out <- list(v1=c("a", "b", "c", "d", "n"), v2=c("g", "h", "k", "i"))

I can get the union of a group of intersecting sets as follows.

Reduce(union, list(data[[1]], data[[3]], data[[4]]))
Reduce(union, list(data[[2]], data[[5]]))

How can I first identify the intersecting vectors? Is there a way to divide the list into groups of intersecting vectors?
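As a first step toward step 1), one possible sketch (not part of the original post) is a pairwise overlap matrix in base R. The name `overlaps` is made up for illustration. Note that pairwise overlap alone does not give the groups: two vectors can belong together only through a third one.

```r
# Sample data from the question.
data <- list(v1 = c("a", "b", "c"), v2 = c("g", "h", "k"),
             v3 = c("c", "d"), v4 = c("n", "a"), v5 = c("h", "i"))

# overlaps[i, j] is TRUE when vectors i and j share at least one element.
# ('overlaps' is an illustrative name, not from the original post.)
overlaps <- sapply(data, function(x)
  sapply(data, function(y) length(intersect(x, y)) > 0))

overlaps["v1", "v3"]  # TRUE: both contain "c"
overlaps["v3", "v4"]  # FALSE: nothing shared directly, yet both overlap v1
```

The last line shows why a grouping step is still needed: v3 and v4 must end up in the same union even though they do not intersect each other.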

Update

Here is an attempt using data.table. It gets the desired result, but it is still slow for large lists.

library(data.table)
# Collapse each vector into one comma-separated string per row.
data <- sapply(data, function(x) paste(x, collapse = ", "))
data <- as.data.frame(data, stringsAsFactors = FALSE)

repeat {
  M <- nrow(data)
  data <- data.table( data , key = "data" )
  data <- data[ , list(dataelement = unique(unlist(strsplit(data , ", " )))), by = list(data)]
  data <- data.table(data , key = "dataelement" )
  data <- data[, list(data = paste0(sort(unique(unlist(strsplit(data, split=", ")))), collapse=", ")), by = "dataelement"]
  data$dataelement <- NULL
  data <- unique(data)
  N <- nrow(data)
  if (M == N)
    break
}

# Split on the same ", " separator used when collapsing, so no element keeps a leading space.
data <- strsplit(as.character(data$data), ", ")

Solution

This is essentially a graph problem, so I like to use the igraph library for it. Using your sample data, you can do:

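library(igraph)
# Build an edge list: embed(x, 2) pairs each element with its neighbour.
el <- do.call("rbind", lapply(data, embed, 2))
# Make an undirected graph from the edge list.
# (Recent igraph releases also offer graph_from_edgelist() and components()
# as the newer names for graph.edgelist() and clusters().)
gg <- graph.edgelist(el, directed = FALSE)
# Partition the vertices into connected components (the disjoint sets).
split(V(gg)$name, clusters(gg)$membership)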

# $`1`
# [1] "b" "a" "c" "d" "n"
# 
# $`2`
# [1] "h" "g" "k" "i"
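To see why `embed` yields a usable edge list, here is a small base-R sketch on the question's sample data (the variable names `pairs` and `single_fails` are only for illustration). Each vector becomes a chain of consecutive pairs, so all of its elements end up connected. One caveat, as far as I can tell: a vector with a single element cannot be embedded with dimension 2, so such vectors would need separate handling.

```r
# Sample data from the question.
data <- list(v1 = c("a", "b", "c"), v2 = c("g", "h", "k"),
             v3 = c("c", "d"), v4 = c("n", "a"), v5 = c("h", "i"))

# embed(x, 2) returns one row per consecutive pair of x, in reversed order.
pairs <- embed(data$v1, 2)
pairs  # rows: ("b", "a") and ("c", "b")

# Stacking the pairs of every vector gives the full edge list.
el <- do.call("rbind", lapply(data, embed, 2))
dim(el)  # 7 rows (edges), 2 columns

# A length-1 vector makes embed() stop with an error.
single_fails <- inherits(try(embed("z", 2), silent = TRUE), "try-error")
single_fails  # TRUE
```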

And we can view the results with

V(gg)$color=c("green","purple")[clusters(gg)$membership]
plot(gg)
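If igraph is not available, the same grouping can be sketched in base R. This is not the approach from the answer above but a simple incremental merge, and `merge_overlapping` is a hypothetical helper name. It assumes the input is a named list and keeps the name of the first vector in each group; the order of the groups in the result is not guaranteed in general.

```r
# Hypothetical helper (not from the original answer): merge overlapping vectors.
# Assumes 'sets' is a named list of atomic vectors.
merge_overlapping <- function(sets) {
  groups <- list()
  for (nm in names(sets)) {
    current <- sets[[nm]]
    # Which of the groups built so far share an element with this vector?
    hits <- vapply(groups, function(g) any(current %in% g), logical(1))
    if (any(hits)) {
      # Fold this vector and every group it touches into one group,
      # stored under the name of the first group that was hit.
      merged <- unique(c(unlist(groups[hits], use.names = FALSE), current))
      first <- names(groups)[which(hits)[1]]
      groups <- groups[!hits]
      groups[[first]] <- merged
    } else {
      # No overlap yet: start a new group under this vector's name.
      groups[[nm]] <- current
    }
  }
  groups
}

data <- list(v1 = c("a", "b", "c"), v2 = c("g", "h", "k"),
             v3 = c("c", "d"), v4 = c("n", "a"), v5 = c("h", "i"))
result <- merge_overlapping(data)
result$v1  # "a" "b" "c" "d" "n"
result$v2  # "g" "h" "k" "i"
```

Each new vector is compared against all existing groups, so this is likely to be slower than the graph approach on large lists; it is meant only as a dependency-free illustration.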
