Create Edge List From Ragged Data Frame in R (for network analysis)

Question

I have a ragged data frame with each row as an occurrence in time of one or more entities, like so:

(time1) entitya entityf entityz
(time2) entityg entityh
(time3) entityo entityp entityk entityL
(time4) entityM

I want to create an edge list for network analysis from a subset of entities found in a second vector (nodelist). My problem is that I don't know:

1). How to subset only the entities in the nodelist. I was considering

datanew<- subset(dataold, dataold %in% nodelist)

but it does not work.

2). How to turn the ragged data frame into a two-column edge list. In the above example, the first row would transform to:

entitya entityf
entitya entityz
entityz entityf
...

I have no idea how to do this. Any help is really appreciated!

Recommended answer

Try this:

# read your data
dat <- strsplit(readLines(textConnection("(time1) entitya entityf entityz
(time2) entityg entityh
(time3) entityo entityp entityk entityL
(time4) entityM")), " ")

# remove (time)
dat <- lapply(dat, `[`, -1)

# filter
nodelist <- c("entitya", "entityf", "entityz", "entityg", "entityh",
              "entityo", "entityp", "entityk")
dat <- lapply(dat, intersect, nodelist)

# create an edge matrix
t(do.call(cbind, lapply(dat[sapply(dat, length) >= 2], combn, 2)))

This last step might be a lot to digest, so here is a breakdown:

  • sapply(dat, length) computes the lengths of your list elements
  • dat[... >= 2] only keeps the list elements with at least two items
  • lapply(..., combn, 2) creates all combinations: a list of wide matrices
  • do.call(cbind, ...) binds all the combinations into a wide matrix
  • t(...) transposes into a tall matrix
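
The steps listed above can also be run one at a time to inspect each intermediate value. The following is only a sketch: the names `lens`, `keep`, `pairs`, `wide` and `edges`, and the column names `from` and `to`, are illustrative and not part of the original answer. `dat` is written out by hand as it would look after the filtering step.

```r
# Rows of the example after dropping the time label and filtering by nodelist
# (entityL and entityM are not in the nodelist, so row 4 ends up empty)
dat <- list(
  c("entitya", "entityf", "entityz"),
  c("entityg", "entityh"),
  c("entityo", "entityp", "entityk"),
  character(0)
)

lens  <- sapply(dat, length)       # lengths of the list elements: 3 2 3 0
keep  <- dat[lens >= 2]            # keep only rows that can form at least one pair
pairs <- lapply(keep, combn, 2)    # one 2-row matrix per row, one column per pair
wide  <- do.call(cbind, pairs)     # bind all pairs side by side: 2 rows, 7 columns
edges <- t(wide)                   # transpose: 7 rows, 2 columns

colnames(edges) <- c("from", "to") # illustrative column names (an assumption)
print(edges)
```

The `>= 2` filter matters because `combn(x, 2)` stops with an error when `x` has fewer than two elements, which would happen here for the empty fourth row.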
