How can I calculate network measures separately for different periods using igraph in R?

Question

This is my transaction data:

data:

id          from_id        to_id      amount    date_trx
<fctr>      <fctr>         <fctr>     <dbl>     <date>
0           7468           5695       700.0     2005-01-04
1           6213           9379       11832.0   2005-01-08
2           7517           8170       1000.0    2005-01-10
3           6143           9845       4276.0    2005-01-12
4           6254           9640       200.0     2005-01-14
5           6669           5815       200.0     2005-01-20
6           6934           8583       49752.0   2005-01-24
7           9240           8314       19961.0   2005-01-26
8           6374           8865       1000.0    2005-01-30
9           6143           6530       13.4      2005-01-31
...

I formed a network in which the edges connect the accounts (nodes) given by from_id and to_id, and the edge weights are determined by the amounts they transact. Then I calculated network measures such as degree centrality, betweenness centrality, and closeness centrality.

That is:

relations <- data.frame(from = data$from_id, 
                        to = data$to_id)
network <- graph_from_data_frame(relations, directed = TRUE)

E(network)$weight <- data$amount
V(network)$degree <- degree(network, normalized=TRUE)
V(network)$betweenness <- betweenness(network, normalized=TRUE)
V(network)$closeness <- closeness(network, normalized=TRUE)

But now I want to calculate these measures periodically. For example, I want to divide my data into weeks (starting from the very first transaction date) and calculate the network measures for each account in the corresponding week.

data$week <- unsplit(tapply(data$date_trx, data$from_id, function(x) (as.numeric(x - min(data$date_trx)) %/% 7) + 1), data$from_id)

select(data, from_id, to_id, date_trx, week, amount) %>% arrange(date_trx)

from_id       to_id      date_trx      week    amount
<fctr>        <fctr>     <date>        <dbl>   <dbl>
6644           6934       2005-01-01    1      700
6753           8456       2005-01-01    1      600
9242           9333       2005-01-01    1      1000
9843           9115       2005-01-01    1      900 
7075           6510       2005-01-02    1      400 
8685           7207       2005-01-02    1      1100   

...            ...        ...           ...    ...

9866           6697       2010-12-31    313    95.8
9866           5992       2010-12-31    313    139.1
9866           5797       2010-12-31    313    72.1
9866           9736       2010-12-31    313    278.9
9868           8644       2010-12-31    313    242.8
9869           8399       2010-12-31    313    372.2

Now that the data is divided into weekly periods, I need to form a separate network of accounts for each week so that I can calculate the network measures per account per week. How can I do that for all 313 weeks at once?

Recommended answer

One possibility is to split your data by week, transform each week into an igraph object, and then add the centralities and degree to all graphs at once using lapply. My initial data.frame is named d (see below):

library(igraph)

head(d)
  from_id to_id weight   date_trx
1       D     I      8 1999-09-12
2       E     H     10 1999-10-20
3       A     G     10 1999-09-10
4       C     G     13 1999-04-15
5       E     J      9 1999-06-26
6       B     F     15 1999-04-30

First get the week:

d$week <- strftime(d$date_trx, format = "%V")
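Note that `%V` is the ISO 8601 week of the year, so in data spanning several years (as in the question) the same week number from different years would land in the same group. A minimal sketch of two alternatives, on hypothetical toy data: a running week index counted from the first transaction date, as the question asks for, and a year-qualified ISO label. The names `toy`, `week_index` and `iso_week` are illustrative only, and `%G` is assumed to be supported by your platform's `strftime`.

```r
# Hypothetical toy data; only the date column matters here.
toy <- data.frame(
  date_trx = as.Date(c("2005-01-01", "2005-01-07", "2005-01-08", "2006-01-01"))
)

# Running week index: whole 7-day blocks since the first transaction, starting at 1.
toy$week_index <- as.numeric(toy$date_trx - min(toy$date_trx)) %/% 7 + 1

# Alternative: ISO week label that includes the ISO year, so weeks from
# different years stay in separate groups (assumes %G is available).
toy$iso_week <- strftime(toy$date_trx, format = "%G-%V")
```

Either column can then be passed to `split()` in place of `d$week`.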

Now split by week:

dd <- split(d, d$week )

Convert each week into an igraph object:

dd <- lapply(dd, function(x) graph_from_data_frame(x, directed = TRUE))
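Each weekly graph built this way contains only the accounts that transacted in that week, so normalized measures are computed against a different vertex count every week. If you would rather have every account present in every weekly graph (inactive ones as isolated vertices), one option is the `vertices` argument of `graph_from_data_frame`. The following is a sketch on hypothetical toy data; `toy`, `all_accounts` and `weekly` are illustrative names.

```r
library(igraph)

# Hypothetical toy edge list with a week label.
toy <- data.frame(
  from_id = c("A", "B", "A"),
  to_id   = c("F", "G", "G"),
  weight  = c(5, 3, 2),
  week    = c("01", "01", "02")
)

# One row per account seen anywhere in the data; the first column is the vertex name.
all_accounts <- data.frame(
  name = sort(unique(c(as.character(toy$from_id), as.character(toy$to_id))))
)

# Build one graph per week, each with the full set of accounts as vertices.
weekly <- lapply(split(toy, toy$week), function(x) {
  graph_from_data_frame(x, directed = TRUE, vertices = all_accounts)
})
```

Whether isolated accounts should count is a modelling choice; some measures, such as closeness, may be undefined for isolated vertices.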

Write a function that performs all the operations you want to carry out, and then apply it to each graph:

my.funct <- function(x) {
  V(x)$degree <- degree(x, normalized=TRUE)
  V(x)$betweenness <- betweenness(x, normalized=TRUE)
  V(x)$closeness <- closeness(x, normalized=TRUE)
  return(x)
}

dd <- lapply(dd, my.funct)
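Because the data frame has a `weight` column, each graph carries a `weight` edge attribute, and igraph's `betweenness()` and `closeness()` use it by default as a path length (larger weight means farther apart). For transaction amounts that may be the opposite of what you intend. Below is a sketch of a variant that makes the weighting explicit; `add_measures` is a hypothetical helper, and using the inverse amount as distance is an assumption, not the original answer's method.

```r
library(igraph)

# Hypothetical variant of my.funct with explicit weighting.
add_measures <- function(g) {
  # Assumption: larger amounts mean a stronger tie, so the inverse amount
  # is used as path length. Requires strictly positive weights.
  dist_w <- 1 / E(g)$weight
  V(g)$degree      <- degree(g, mode = "all", normalized = TRUE)
  V(g)$strength    <- strength(g, mode = "all")  # sum of incident edge weights
  V(g)$betweenness <- betweenness(g, weights = dist_w, normalized = TRUE)
  V(g)$closeness   <- closeness(g, mode = "all", weights = dist_w, normalized = TRUE)
  g
}

# Toy graph: A -> B (amount 2), B -> C (amount 4).
g  <- graph_from_data_frame(
  data.frame(from = c("A", "B"), to = c("B", "C"), weight = c(2, 4)),
  directed = TRUE
)
g2 <- add_measures(g)
```

To ignore the amounts entirely, `weights = NA` can be passed to `betweenness()` and `closeness()` instead.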

For example, for the first week:

dd[[1]]
IGRAPH f515e52 DN-- 4 2 -- 
+ attr: name (v/c), degree (v/n), betweenness (v/n), closeness (v/n), weight (e/n), date_trx
| (e/n), week (e/c)
+ edges from f515e52 (vertex names):
[1] B->F C->G



get.vertex.attribute(dd[[1]])
$name
[1] "B" "C" "F" "G"

$degree
[1] 0.3333333 0.3333333 0.3333333 0.3333333

$betweenness
[1] 0 0 0 0

$closeness
[1] 0.3333333 0.3333333 0.2500000 0.2500000



get.edge.attribute(dd[[1]])
$weight
[1] 9 7

$date_trx
[1] 10595 10601

$week
[1] "01" "01"

You can then retrieve all centralities and degrees for all weeks:

ddd <- lapply(dd, function(x) igraph::as_data_frame(x, what = "vertices") )

# keep in mind that `split` names the objects in the list according to
# the value it used to split, therefore the name of the data.frames in
# the list is the name of the week.

library(dplyr)
ddd <- bind_rows(ddd, .id="week")

head(ddd)
  week name    degree betweenness closeness
1   01    E 1.4444444           0 0.2000000
2   01    D 1.5555556           0 0.1666667
3   01    B 0.7777778           0 0.2000000
4   01    A 1.0000000           0 0.2000000
5   01    C 0.7777778           0 0.1666667
6   01    F 1.0000000           0 0.1000000

If needed, you can use this to merge the measures back into the original edge list.
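A merge back could look like the sketch below, which matches each transaction to the sender's measures for the same week. It uses base `merge()` on hypothetical toy stand-ins (`edges` for `d`, `measures` for `ddd`); the column values are made up for illustration.

```r
# Toy stand-in for the edge list `d` (with its week column).
edges <- data.frame(
  from_id = c("A", "B"),
  to_id   = c("F", "G"),
  week    = c("01", "02"),
  stringsAsFactors = FALSE
)

# Toy stand-in for the per-week vertex table `ddd`.
measures <- data.frame(
  week   = c("01", "01", "02", "02"),
  name   = c("A", "F", "B", "G"),
  degree = c(1, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Left join: keep every transaction, attach the sender's measures for that week.
edges_with_measures <- merge(
  edges, measures,
  by.x = c("week", "from_id"),
  by.y = c("week", "name"),
  all.x = TRUE
)
```

Repeating the merge with `to_id` in place of `from_id` would attach the receiver's measures as well.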

Data used in this example:

set.seed(123)
d <- data.frame(from_id = sample(LETTERS[1:5], 2000, replace = T),
                to_id = sample(LETTERS[6:10], 2000, replace = T),
                weight = rpois(2000, 10),
                date_trx = sample(seq(as.Date('1999/01/01'), as.Date('2000/01/01'), by="day"), 2000, replace = T))
