How do I load a graph in neighborhood list format?


Question

I have a file of neighborhood lists describing a directed graph:

1 2 5
2 4

which is equivalent to the edge list format:

1 2
1 5
2 4

How do I load it into igraph?

I can use readLines and strsplit, but I have a feeling that this has been done before by someone else.
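For reference, the readLines/strsplit route mentioned above can be sketched in base R as follows. This is a minimal sketch rather than an established igraph importer: the helper name neighbors_to_edges is hypothetical, and it assumes each line holds a source vertex followed by its out-neighbors, separated by spaces or tabs.

```r
# Hypothetical helper (not part of igraph or the original post):
# expand neighborhood-list lines into a two-column edge list.
neighbors_to_edges <- function(lines) {
  # Split each line on runs of whitespace (spaces or tabs)
  parts <- strsplit(trimws(lines), "[[:space:]]+")
  # Drop empty lines
  parts <- parts[lengths(parts) > 0]
  # Repeat each source vertex once per neighbor on its line
  from <- rep(vapply(parts, `[`, character(1), 1L), lengths(parts) - 1L)
  # Everything after the first field on a line is a target vertex
  to <- as.character(unlist(lapply(parts, `[`, -1L)))
  data.frame(from = from, to = to, stringsAsFactors = FALSE)
}

edges <- neighbors_to_edges(c("1 2 5", "2 4"))
print(edges)
```

For a file on disk, the same helper could be called as neighbors_to_edges(readLines("graph.txt")), where "graph.txt" is a placeholder file name.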

Recommended answer

If you are open to using a package still in development, I would suggest exploring the "iotools" package. Its file reader is fast (think along the lines of fread from "data.table") and it includes some splitting features. Use it in conjunction with cSplit from my "splitstackshape" package.

Here's a reproducible example with 1M rows.

First, a function to make some sample data:

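# Build `size` rows: a row ID followed by a random number (up to 20) of
# neighbor IDs drawn from 1:100 with replacement
data.maker <- function(size) {
  set.seed(1)  # fixed seed for reproducibility
  lapply(seq_len(size), function(x) {
    as.character(c(x, sample(100, sample(20), TRUE)))
  })
}

x <- data.maker(1000000)
# Write one tab-separated line per row
writeLines(vapply(x, paste, FUN.VALUE = character(1L), collapse = "\t"), "mytest.txt")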

Second, load "dplyr" for piping, "iotools" for fast reading, and "splitstackshape" (which also loads "data.table") for splitting and aggregating.

library(dplyr)
library(iotools)
library(splitstackshape)

Here it is all together:

system.time({
  out <- input.file("mytest.txt", formatter = mstrsplit, sep = NA, nsep = "\t") %>%
    as.data.table(keep.rownames = TRUE) %>%
    cSplit("V1", "\t", "long") %>%
    .[, .N, by = .(rn, V1)]
})
#    user  system elapsed 
#  26.109   0.096  26.200 

Here's a view of the output:

out
#               rn V1 N
#       1:       1 94 1
#       2:       1 22 1
#       3:       1 66 1
#       4:       1 13 1
#       5:       1 27 1
#      ---             
# 9865359: 1000000  1 1
# 9865360: 1000000 85 1
# 9865361: 1000000 91 1
# 9865362: 1000000 44 1
# 9865363: 1000000 20 1
summary(out)
#       rn                  V1              N        
#  Length:9865363     Min.   :  1.0   Min.   :1.000  
#  Class :character   1st Qu.: 25.0   1st Qu.:1.000  
#  Mode  :character   Median : 51.0   Median :1.000  
#                     Mean   : 50.5   Mean   :1.064  
#                     3rd Qu.: 75.0   3rd Qu.:1.000  
#                     Max.   :100.0   Max.   :5.000 


If you prefer more standard packages, you can try the following. It should also be reasonably fast:

library(dplyr)
library(stringi)
library(data.table)


temp <- stri_split_fixed(readLines("mytest.txt"), "\t", n = 2, simplify = TRUE) %>%
  as.data.table %>%
  .[, list(V2 = unlist(strsplit(V2, "\t", TRUE))), by = V1] %>%
  .[, .N, by = .(V1, V2)]
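Both pipelines above stop at a table of (source, target, count) rows, so the last step of getting it into igraph is still open. A minimal sketch of that step, assuming the igraph package is installed and using a small hand-made data frame (the column names from, to, and N are illustrative) in place of the out or temp tables:

```r
library(igraph)

# Small stand-in for the tables built above: source, target, edge count
edges <- data.frame(
  from = c("1", "1", "2"),
  to   = c("2", "5", "4"),
  N    = c(1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# The first two columns are read as edges; any remaining columns
# (here N) are attached as edge attributes
g <- graph_from_data_frame(edges, directed = TRUE)

vcount(g)  # number of vertices
ecount(g)  # number of edges
```

Passing out or temp in place of edges should work the same way, provided the source and target columns come first; the N column would then be available as an edge attribute, for example as a weight for repeated edges.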
