Python: how to use Python to generate a random sparse symmetric matrix?
Problem description
How can I use Python to generate a random sparse symmetric matrix?
In MATLAB, there is a function sprandsym(size, density).
How can I do the same in Python?
Recommended answer
If you have scipy, you could use sparse.random. The sprandsym function below generates a sparse random matrix X, takes its upper triangular half, and adds the transpose of that half to it to form a symmetric matrix. Since this doubles the diagonal values, the diagonal is subtracted once.
The non-zero values are normally distributed with mean 0 and standard deviation 1. The Kolmogorov-Smirnov test is used to check that the non-zero values are consistent with a draw from a normal distribution, and a histogram and a QQ-plot are also generated to visualize the distribution.
import numpy as np
import scipy.stats as stats
import scipy.sparse as sparse
import matplotlib.pyplot as plt
np.random.seed((3,14159))
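def sprandsym(n, density):
    # draw the non-zero values from a standard normal distribution
    rvs = stats.norm().rvs
    X = sparse.random(n, n, density=density, data_rvs=rvs)
    # mirror the upper triangle; subtract the diagonal once so it is not doubled
    upper_X = sparse.triu(X)
    result = upper_X + upper_X.T - sparse.diags(X.diagonal())
    return result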
M = sprandsym(5000, 0.01)
print(repr(M))
# <5000x5000 sparse matrix of type '<class 'numpy.float64'>'
# with 249909 stored elements in Compressed Sparse Row format>
# check that the matrix is symmetric. The difference should have no non-zero elements
assert (M - M.T).nnz == 0
statistic, pval = stats.kstest(M.data, 'norm')
# The null hypothesis is that M.data was drawn from a normal distribution.
# A small p-value (say, below 0.05) would indicate reason to reject the null hypothesis.
# Since `pval` below is > 0.05, kstest gives no reason to reject the hypothesis
# that M.data is normally distributed.
print(statistic, pval)
# 0.0015998040114 0.544538788914
fig, ax = plt.subplots(nrows=2)
ax[0].hist(M.data, density=True, bins=50)  # `normed` was removed from newer matplotlib; `density` replaces it
stats.probplot(M.data, dist='norm', plot=ax[1])
plt.show()
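The diagonal correction described above can be checked on a small matrix. The following is a minimal sketch rather than part of the original answer; the 6x6 size, the density of 0.5, and the names doubled and fixed are arbitrary choices made for illustration.

```python
import numpy as np
import scipy.sparse as sparse

# Small example so the result is easy to inspect by eye (sizes are arbitrary).
X = sparse.random(6, 6, density=0.5, format="csr", random_state=0)
upper_X = sparse.triu(X)

# Adding the transpose counts every diagonal entry twice...
doubled = upper_X + upper_X.T
# ...so the diagonal of X is subtracted once to restore it.
fixed = doubled - sparse.diags(X.diagonal())

print(np.allclose(doubled.diagonal(), 2 * X.diagonal()))
print(np.allclose(fixed.diagonal(), X.diagonal()))
print(np.allclose(fixed.toarray(), fixed.toarray().T))
```

All three checks should hold regardless of which entries happen to be non-zero, since they only depend on how the upper triangle is mirrored.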
PS. I used
upper_X = sparse.triu(X)
result = upper_X + upper_X.T - sparse.diags(X.diagonal())
instead of
result = (X + X.T)/2.0
because I could not convince myself that the non-zero elements in (X + X.T)/2.0 have the right distribution. First, if X were dense and normally distributed with mean 0 and variance 1, i.e. N(0, 1), then (X + X.T)/2.0 would be N(0, 1/2). Certainly we could fix this by using
result = (X + X.T)/sqrt(2.0)
instead. Then result would be N(0, 1). But there is yet another problem: if X is sparse, then at nonzero locations, X + X.T would often be a normally distributed random variable plus zero. Dividing by sqrt(2.0) will then squash the normal distribution closer to 0, giving a more tightly peaked distribution. As X becomes sparser, the result may look less and less like a normal distribution.
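One way to probe this concern empirically is to measure the spread of the off-diagonal stored values of (X + X.T)/sqrt(2.0) for a sparse X and for a fully dense X. The sketch below is not from the original answer; the helper name scaled_sum_std, the matrix sizes, and the seed are assumptions chosen for illustration. If the reasoning above holds, the sparse case should give a standard deviation near 1/sqrt(2) (about 0.71), while the dense case should give a value near 1.

```python
import numpy as np
import scipy.sparse as sparse


def scaled_sum_std(n, density, seed=0):
    """Std of the off-diagonal stored values of (X + X.T)/sqrt(2).

    Hypothetical helper for illustration; X has N(0, 1) non-zero values.
    """
    rng = np.random.default_rng(seed)
    X = sparse.random(n, n, density=density, format="csr",
                      random_state=seed, data_rvs=rng.standard_normal)
    S = ((X + X.T) / np.sqrt(2.0)).tocoo()
    # Diagonal entries are X[i, i] added to itself, so they are excluded here.
    off_diag = S.data[S.row != S.col]
    return off_diag.std()


print(scaled_sum_std(1000, 0.01))  # sparse X: most entries are "normal plus zero"
print(scaled_sum_std(300, 1.0))    # dense X: every entry is a sum of two normals
```

This only compares standard deviations; it does not by itself say what shape the distribution has when X is moderately sparse, where the stored values are a mixture of the two cases.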
Since I didn't know what distribution (X + X.T)/sqrt(2.0) generates, I opted for copying the upper triangular half of X (thus repeating what I know to be normally distributed non-zero values).