How to create a relational matrix in R?
Question
Original data:
df <- structure(list(ID_client = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), .Label = c("1_", "2_", "3_", "4_"), class = "factor"), Connected = c(1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L), Year = c(2010L, 2010L, 2010L, 2010L, 2015L, 2015L, 2015L, 2015L)), class = "data.frame", row.names = c(NA, -8L))
Original data, printed:
`ID_client Connected Year
1_ 1 2010
2_ 1 2010
3_ 1 2010
4_ 0 2010
1_ 1 2015
2_ 0 2015
3_ 1 2015
4_ 0 2015`
My intention is to create the following data:
`Year ID_client 1_ 2_ 3_ 4_
2010 1_ 0 1 1 0
2010 2_ 1 0 1 0
2010 3_ 1 1 0 0
2010 4_ 0 0 0 0
2015 1_ 0 0 1 0
2015 2_ 0 0 0 0
2015 3_ 1 0 0 0
2015 4_ 0 0 0 0`
In other words, a matrix that expresses that, for instance, in 2010 clients 1_, 2_, and 3_ were connected to each other, while client 4_ was not connected to anyone. Importantly, I do not consider a client to be connected with herself, so the diagonal is always 0.
I tried the following code:
df %>%
group_by(Year, Connected) %>%
mutate(temp = rev(ID_client)) %>%
pivot_wider(names_from = ID_client,
values_from = Connected,
values_fill = list(Connected = 0)) %>%
arrange(Year, temp)
This code does not reproduce what I need. Instead, this is the result:
`Year ID_client 1_ 2_ 3_ 4_
2010 1_ 0 0 1 0
2010 2_ 0 1 0 0
2010 3_ 1 0 0 0
2010 4_ 0 0 0 0
2015 1_ 0 0 1 0
2015 2_ 0 0 0 0
2015 3_ 1 0 0 0
2015 4_ 0 0 0 0`
Recommended answer
We can group_by Year and, for each row, create a new list column holding the ID_client values in that group that have Connected == 1, excluding the current row's own ID. After unnesting, we complete the missing levels (clients with no connections drop out of the unnested data) and then cast the data to wide format.
library(tidyverse)
df %>%
group_by(Year) %>%
mutate(temp = map(ID_client, ~setdiff(ID_client[Connected == 1], .x))) %>%
unnest(cols = temp) %>%
complete(temp = unique(ID_client), fill = list(Connected = 0)) %>%
mutate(ID_client = coalesce(as.character(ID_client), temp)) %>%
pivot_wider(names_from = temp,
values_from = Connected,
values_fill = list(Connected = 0)) %>%
arrange(Year, ID_client)
# Year ID_client `1_` `2_` `3_` `4_`
# <int> <chr> <dbl> <dbl> <dbl> <dbl>
#1 2010 1_ 0 1 1 0
#2 2010 2_ 1 0 1 0
#3 2010 3_ 1 1 0 0
#4 2010 4_ 0 0 0 0
#5 2015 1_ 0 0 1 0
#6 2015 2_ 0 0 0 0
#7 2015 3_ 1 0 0 0
#8 2015 4_ 0 0 0 0
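As a cross-check, the same table can probably be built in base R. The sketch below is not part of the answer above; it assumes that two clients count as connected in a year exactly when both have Connected == 1 in that year, which is what the desired output suggests. The helper name relational_matrix is hypothetical, and df is rebuilt here only so the block runs on its own.

```r
# Same data as in the question, rebuilt compactly.
df <- data.frame(
  ID_client = factor(rep(c("1_", "2_", "3_", "4_"), times = 2)),
  Connected = c(1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L),
  Year      = rep(c(2010L, 2015L), each = 4)
)

# Hypothetical helper: one block of the relational matrix per year.
relational_matrix <- function(df) {
  ids <- levels(factor(df$ID_client))
  per_year <- lapply(split(df, df$Year), function(d) {
    # Connection flag for every known client, 0 if absent in this year.
    conn <- d$Connected[match(ids, d$ID_client)]
    conn[is.na(conn)] <- 0L
    # Cell (i, j) is 1 only when both client i and client j are connected.
    m <- outer(conn, conn)
    # A client is never connected with herself.
    diag(m) <- 0L
    colnames(m) <- ids
    data.frame(Year = d$Year[1], ID_client = ids, m,
               check.names = FALSE, stringsAsFactors = FALSE)
  })
  out <- do.call(rbind, per_year)
  rownames(out) <- NULL
  out
}

res <- relational_matrix(df)
print(res)
```

The outer product makes the pairwise rule explicit, and setting the diagonal to zero handles the "not connected with herself" requirement directly. If the real data defines connections differently (for example, through explicit pairs), this shortcut would not apply.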