Generating random correlated x and y points using Numpy
Question
I'd like to generate correlated arrays of x and y coordinates in order to test various matplotlib plotting approaches, but I'm failing somewhere, because I can't get numpy.random.multivariate_normal to give me the samples I want. Ideally, I want my x values between -0.51 and 51.2, and my y values between 0.33 and 51.6 (though I suppose equal ranges would be OK, since I can constrain the plot afterwards), but I'm not sure what mean (0, 0?) and covariance values I should be using to get these samples from the function.
Recommended answer
As the name implies, numpy.random.multivariate_normal generates normally distributed samples. This means there is a non-zero probability of finding points outside any given interval. You can generate correlated uniform distributions instead, but this is a little more convoluted. Take a look here for two possible methods.
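As an illustration of the bounded alternative, here is a minimal sketch of one common way to get correlated uniform samples, a Gaussian copula. This is an assumed technique chosen for the example and not necessarily one of the two methods referred to above. The seed, the variable names and the value of corr are illustrative choices.

```python
import math
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

corr = 0.8  # correlation of the underlying normals (illustrative value)
cov = [[1.0, corr], [corr, 1.0]]

# Step 1: draw correlated standard normal pairs, shape (1000, 2)
z = rng.multivariate_normal([0.0, 0.0], cov, size=1000)

# Step 2: push each value through the standard normal CDF,
# which turns each column into a Uniform(0, 1) sample
std_normal_cdf = np.vectorize(
    lambda v: 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))
)
u = std_normal_cdf(z)

# Step 3: rescale the unit interval to the requested x and y ranges
x_lo, x_hi = -0.51, 51.2
y_lo, y_hi = 0.33, 51.6
x = x_lo + (x_hi - x_lo) * u[:, 0]
y = y_lo + (y_hi - y_lo) * u[:, 1]
```

Every point is guaranteed to fall inside the requested ranges. The correlation between x and y comes out slightly lower than corr, because the CDF transform is nonlinear.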
If you want to go with the normal distribution, you can set up the sigmas so that your half-interval corresponds to 3 standard deviations (you can also filter out the bad points if needed). This way, ~99% of your points will fall inside your interval, for example:
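import numpy as np
import matplotlib.pyplot as plt

# Endpoints of the desired x and y intervals
xx = np.array([-0.51, 51.2])
yy = np.array([0.33, 51.6])
means = [xx.mean(), yy.mean()]  # centre of each interval
# For two endpoints, std() equals the half-interval, so dividing by 3
# puts the interval edges 3 standard deviations away from the mean
stds = [xx.std() / 3, yy.std() / 3]
corr = 0.8  # correlation
covs = [[stds[0]**2, stds[0]*stds[1]*corr],
        [stds[0]*stds[1]*corr, stds[1]**2]]
m = np.random.multivariate_normal(means, covs, 1000).T
plt.scatter(m[0], m[1])
plt.show()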
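The filtering step mentioned above could look roughly like the following sketch. It rebuilds the same means and covariance from the interval endpoints and then keeps only the points that fall inside both ranges using a boolean mask. The seed and the names x_lo, x_hi, y_lo, y_hi, inside and kept are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is reproducible

x_lo, x_hi = -0.51, 51.2
y_lo, y_hi = 0.33, 51.6

# Centre of each interval, and sigma = half-interval / 3
means = [(x_lo + x_hi) / 2, (y_lo + y_hi) / 2]
stds = [(x_hi - x_lo) / 6, (y_hi - y_lo) / 6]
corr = 0.8  # correlation
covs = [[stds[0]**2, stds[0] * stds[1] * corr],
        [stds[0] * stds[1] * corr, stds[1]**2]]

m = rng.multivariate_normal(means, covs, 1000).T  # shape (2, 1000)

# Boolean mask: True where both coordinates lie inside their interval
inside = (
    (m[0] >= x_lo) & (m[0] <= x_hi)
    & (m[1] >= y_lo) & (m[1] <= y_hi)
)
kept = m[:, inside]  # shape (2, number of points kept)
```

Filtering leaves slightly fewer than 1000 points, so draw a few extra samples if an exact count matters.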