Python: how to generate a random sparse symmetric matrix?


Question

How can I use Python to generate a random sparse symmetric matrix?

In MATLAB, we have the function sprandsym(size, density).

But how do I do that in Python?

Recommended answer

If you have scipy, you could use sparse.random. The sprandsym function below generates a sparse random matrix X, takes its upper triangular half, and adds the transpose of that half to form a symmetric matrix. Since this doubles the diagonal values, the diagonal is subtracted once.

The non-zero values are normally distributed with mean 0 and standard deviation 1. The Kolmogorov-Smirnov test is used to check that the non-zero values are consistent with a draw from a normal distribution, and a histogram and a QQ-plot are also generated to visualize the distribution.

import numpy as np
import scipy.stats as stats
import scipy.sparse as sparse
import matplotlib.pyplot as plt
np.random.seed((3,14159))

def sprandsym(n, density):
    rvs = stats.norm().rvs
    X = sparse.random(n, n, density=density, data_rvs=rvs)
    upper_X = sparse.triu(X) 
    result = upper_X + upper_X.T - sparse.diags(X.diagonal())
    return result

M = sprandsym(5000, 0.01)
print(repr(M))
# <5000x5000 sparse matrix of type '<class 'numpy.float64'>'
#   with 249909 stored elements in Compressed Sparse Row format>

# check that the matrix is symmetric. The difference should have no non-zero elements
assert (M - M.T).nnz == 0

statistic, pval = stats.kstest(M.data, 'norm')
# The null hypothesis is that M.data was drawn from a normal distribution.
# A small p-value (say, below 0.05) would indicate reason to reject the null hypothesis.
# Since `pval` below is > 0.05, kstest gives no reason to reject the hypothesis
# that M.data is normally distributed.
print(statistic, pval)
# 0.0015998040114 0.544538788914

fig, ax = plt.subplots(nrows=2)
# density=True replaces the `normed` argument, which newer Matplotlib no longer accepts
ax[0].hist(M.data, density=True, bins=50)
stats.probplot(M.data, dist='norm', plot=ax[1])
plt.show()
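
As a quick sanity check on the approach above, the following sketch confirms that the result is symmetric and that its density stays close to the requested one. It is not part of the original answer: the name sprandsym_seeded and its seed parameter are hypothetical, and passing one RandomState to both random_state and data_rvs is an assumption made here so that the output is reproducible without seeding the global generator.

```python
import numpy as np
import scipy.sparse as sparse


def sprandsym_seeded(n, density, seed=0):
    """Hypothetical variant of sprandsym that takes an explicit seed."""
    rs = np.random.RandomState(seed)
    # The same RandomState picks the sparsity pattern and draws the values.
    X = sparse.random(n, n, density=density, random_state=rs,
                      data_rvs=rs.standard_normal)
    upper_X = sparse.triu(X)
    # Mirror the upper triangle; subtract the diagonal once because
    # upper_X + upper_X.T counts it twice.
    return (upper_X + upper_X.T - sparse.diags(X.diagonal())).tocsr()


n, density = 500, 0.05
M = sprandsym_seeded(n, density)

# A symmetric matrix minus its transpose has no non-zero entries.
is_symmetric = (M - M.T).nnz == 0
# Roughly half of X's non-zeros lie in the upper triangle and each is
# mirrored, so the achieved density should be close to the requested one.
achieved_density = M.nnz / (n * n)
print(is_symmetric, achieved_density)
```

The achieved density only approximates the requested value, because the number of non-zeros that happen to fall in the upper triangle varies from draw to draw.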

PS. I used

upper_X = sparse.triu(X) 
result = upper_X + upper_X.T - sparse.diags(X.diagonal())

instead of

 result = (X + X.T)/2.0

because I could not convince myself that the non-zero elements in (X + X.T)/2.0 have the right distribution. First, if X were dense with independent entries distributed as N(0, 1), i.e. mean 0 and variance 1, then the off-diagonal entries of (X + X.T)/2.0 would be N(0, 1/2). Certainly we could fix this by using

 result = (X + X.T)/sqrt(2.0)

instead. Then the off-diagonal entries of result would be N(0, 1). But there is yet another problem: if X is sparse, then at most non-zero locations X + X.T is a normally distributed random variable plus zero. Dividing by sqrt(2.0) squashes that value toward 0, giving a more tightly peaked distribution. As X becomes sparser, the non-zero values look less and less like draws from N(0, 1).
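
The effect described here can be checked numerically. The sketch below is an illustration added to the answer, not the author's code; the sizes, the seed, and the variable names are arbitrary choices. It first checks the dense case, where the off-diagonal entries of (D + D.T)/2.0 should have variance close to 1/2, and then the sparse case, where most off-diagonal non-zeros of (X + X.T)/sqrt(2.0) are a single N(0, 1) draw divided by sqrt(2.0).

```python
import numpy as np
import scipy.sparse as sparse
import scipy.stats as stats

rs = np.random.RandomState(0)

# Dense case: averaging two independent N(0, 1) values halves the variance.
D = rs.standard_normal((500, 500))
A = (D + D.T) / 2.0
dense_var = A[np.triu_indices(500, k=1)].var()

# Sparse case: X[i, j] and X[j, i] are rarely both non-zero, so most
# entries of X + X.T are one normal draw plus zero.
n, density = 2000, 0.01
X = sparse.random(n, n, density=density, format='csr',
                  random_state=rs, data_rvs=rs.standard_normal)
S = (X + X.T) / np.sqrt(2.0)

# Look only at the strictly upper triangle (the lower one mirrors it).
offdiag = sparse.triu(S, k=1).data
sparse_std = offdiag.std()

# Kolmogorov-Smirnov test against the standard normal N(0, 1).
statistic, pval = stats.kstest(offdiag, 'norm')
print(dense_var, sparse_std, pval)
```

With these settings the standard deviation of the sparse off-diagonal values should come out near 1/sqrt(2.0) rather than 1, and the test should reject the standard normal hypothesis decisively.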

Since I didn't know what distribution (X + X.T)/sqrt(2.0) generates, I opted to copy the upper triangular half of X (thus repeating non-zero values that I know to be normally distributed).
