How to generate a dataset of correlated variables with different distributions?
For teaching purposes, I need to generate random datasets of correlated random variables with different distributions. I have tried corr2data in Stata, but it will not allow me to specify max and min values of the variables to be generated, just means, SDs and the covariance matrix. Therefore, I need to do messy adjustments after generating the data. Various other details of corr2data annoy me as well. Is there a simpler way of doing this with MATLAB? I am not as familiar with this software as I am with Stata.
If you have access to Statistics Toolbox as well as MATLAB, you can use the copula functionality to do this fairly easily. Using a copula, you can specify the marginal distributions of each variable, and a correlation structure between the variables.
You can then generate random numbers from the copula, fit it to data, and so on.
See the MATLAB documentation:
Copulas: Generate Correlated Samples
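As a rough illustration of the copula approach described above, here is a minimal sketch. It assumes the Statistics Toolbox functions copularnd, betainv, unifinv and gaminv are available. The variable names (score, age, income), the marginal distributions and the correlation matrix are made up for the example and are not from the original question. The idea is to draw correlated uniforms from a Gaussian copula, then push each column through the inverse CDF of whatever marginal you want. Choosing bounded marginals (uniform, rescaled beta) gives hard minimum and maximum values without any adjustment after generation.

```matlab
% Sketch only: three correlated variables with different marginals.
% Assumes Statistics Toolbox (copularnd, betainv, unifinv, gaminv, corr).
rng(42);                          % fix the seed so the dataset is reproducible

n   = 1000;                       % number of observations (illustrative)
Rho = [1.0 0.6 0.3;               % correlation parameter of the Gaussian copula
       0.6 1.0 0.5;               % (illustrative values; must be positive definite)
       0.3 0.5 1.0];

% Step 1: draw correlated uniforms on (0,1) from a Gaussian copula.
U = copularnd('Gaussian', Rho, n);

% Step 2: map each column through the inverse CDF of the desired marginal.
score  = 100 * betainv(U(:,1), 2, 5);   % beta(2,5) rescaled to the range [0, 100]
age    = unifinv(U(:,2), 18, 65);       % uniform on [18, 65]
income = gaminv(U(:,3), 2, 15000);      % gamma, shape 2, scale 15000 (no upper bound)

X = [score age income];

% The rank correlation of X is determined by the copula alone, because the
% transforms in step 2 are monotone. The Pearson correlation of X will
% generally differ somewhat from Rho.
disp(corr(X, 'Type', 'Spearman'));
disp([min(X); max(X)]);                 % observed minimum and maximum per column
```

If you need the linear (Pearson) correlations of the final variables to hit specific targets, you would have to tune Rho, since only the rank correlations carry through the marginal transforms unchanged.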