Reconstruct symmetric matrix from values in long-form

Question

I have a tsv that looks like this (long-form):

  one   two   value
  a     b     30
  a     c     40
  a     d     20
  b     c     10
  b     d     05
  c     d     30

I'm trying to get this into a data frame in R (or pandas) as a symmetric matrix:

    a  b  c  d 
a   00 30 40 20
b   30 00 10 05 
c   40 10 00 30
d   20 05 30 00

The problem is that my tsv only defines each pair once (a,b but not b,a), so I get a lot of NAs in my data frame.

The final goal is to get a distance matrix to use in clustering. Any help would be appreciated.

Recommended answer

An igraph solution: read in the data frame as an undirected edge list, treating the value column as edge weights, then convert the graph to an adjacency matrix.

dat <- read.table(header=TRUE, text=" one   two   value
  a     b     30
  a     c     40
  a     d     20
  b     c     10
  b     d     05
  c     d     30")

library(igraph)

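# Make the graph undirected so that the adjacency matrix is symmetric
# (graph_from_data_frame is the current name for graph.data.frame)
g <- graph_from_data_frame(dat, directed=FALSE)

# Fill the matrix with the "value" edge attribute instead of 0/1 entries
# (as_adjacency_matrix is the current name for get.adjacency)
as_adjacency_matrix(g, attr="value", sparse=FALSE)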
#   a  b  c  d
#a  0 30 40 20
#b 30  0 10  5
#c 40 10  0 30
#d 20  5 30  0
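
Since the stated goal is a distance matrix for clustering, here is a minimal base-R sketch of the same reconstruction without igraph, followed by the conversion to a `dist` object. It assumes every unordered pair appears at most once in the file and that the diagonal (self-distance) should be 0. The names `labels`, `m`, `d` and `hc` are illustrative, and average linkage is an arbitrary choice, not something the question specifies.

```r
# Same sample data as above (long form: one row per unordered pair)
dat <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "one two value
a b 30
a c 40
a d 20
b c 10
b d 05
c d 30")

# All labels that appear in either column
labels <- sort(unique(c(dat$one, dat$two)))

# Start from a zero matrix so the diagonal (self-distance) is 0
m <- matrix(0, nrow = length(labels), ncol = length(labels),
            dimnames = list(labels, labels))

# Fill both (i, j) and (j, i) by indexing with two-column character matrices
m[cbind(dat$one, dat$two)] <- dat$value
m[cbind(dat$two, dat$one)] <- dat$value

# Convert to a "dist" object and cluster (linkage method is an assumption)
d <- as.dist(m)
hc <- hclust(d, method = "average")
```

The matrix returned by the igraph approach can be passed to `as.dist()` in the same way.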
