如何在R中创建关系矩阵? [英] How to create relational matrix in R?

查看:356
本文介绍了如何在R中创建关系矩阵?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

原始数据:

df <- structure(list(ID_client = structure(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), .Label = c("1_", "2_", "3_", "4_"), class = "factor"), Connected = c(1L, 1L, 1L, 0L, 1L, 0L, 1L, 0L), Year = c(2010L, 2010L, 2010L, 2010L, 2015L, 2015L, 2015L, 2015L)), class = "data.frame", row.names = c(NA, -8L))

原始数据:

`ID_client Connected  Year
1_            1      2010
2_            1      2010
3_            1      2010
4_            0      2010
1_            1      2015
2_            0      2015
3_            1      2015
4_            0      2015`

我的意图是创建以下数据:

My intention is to create the following data:

`Year ID_client    1_   2_   3_   4_
2010     1_       0    1    1    0
2010     2_       1    0    1    0
2010     3_       1    1    0    0
2010     4_       0    0    0    0
2015     1_       0    0    1    0
2015     2_       0    0    0    0
2015     3_       1    0    0    0
2015     4_       0    0    0    0`

换句话说,连接了一个矩阵,该矩阵表示在例如2010个客户端1_,2_和3_中,而另一个未连接.重要的是,我不认为有人与自己有联系.

In other words, a matrix that express that in, for instance, 2010 clients 1_, 2_, and 3_ were connected, while the other one was not. Importantly, I do not consider someone to be connected with herself.

我尝试了以下代码:

df %>%
  group_by(Year, Connected) %>%
  mutate(temp = rev(ID_client)) %>%
  pivot_wider(names_from = ID_client, 
          values_from = Connected, 
          values_fill = list(Connected = 0)) %>%
  arrange(Year, temp)

此代码无法再现我所需要的.相反,这是结果:

This code does not reproduce what I need. Instead, this is the result:

`Year ID_client    1_   2_   3_   4_
2010     1_       0    0    1    0
2010     2_       0    1    0    0
2010     3_       1    0    0    0
2010     4_       0    0    0    0
2015     1_       0    0    1    0
2015     2_       0    0    0    0
2015     3_       1    0    0    0
2015     4_       0    0    0    0`

推荐答案

我们可以group_by Year并创建一个具有ID_client值的新列,该列在每个组中除当前值外均具有Connected == 1.我们complete缺少的级别,然后将数据转换为宽格式.

We can group_by Year and create a new column with ID_client values which has Connected == 1 in each group except for the current value. We complete the missing levels and then cast the data to wide format.

library(tidyverse)

df %>%
  group_by(Year) %>%
  mutate(temp = map(ID_client, ~setdiff(ID_client[Connected == 1], .x))) %>%
  unnest(cols = temp) %>%
  complete(temp = unique(ID_client), fill = list(Connected = 0)) %>%
  mutate(ID_client  = coalesce(as.character(ID_client), temp)) %>%
  pivot_wider(names_from = temp, 
              values_from = Connected, 
              values_fill = list(Connected = 0)) %>%
  arrange(Year, ID_client)

#   Year ID_client  `1_`  `2_`  `3_`  `4_`
#  <int> <chr>     <dbl> <dbl> <dbl> <dbl>
#1  2010 1_            0     1     1     0
#2  2010 2_            1     0     1     0
#3  2010 3_            1     1     0     0
#4  2010 4_            0     0     0     0
#5  2015 1_            0     0     1     0
#6  2015 2_            0     0     0     0
#7  2015 3_            1     0     0     0
#8  2015 4_            0     0     0     0

这篇关于如何在R中创建关系矩阵?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆