Central Limit Theorem in MATLAB

Question

I am trying to demonstrate the CLT in MATLAB by comparing the histogram of the sum of three random variables with that of a normal distribution.

Here is my code:

clc;clear;
len = 50000;

%y0 : Exponential Distribution
lambda = 3;
y0=-log(rand(1,len))./lambda;


%y1 :  Rayleigh Distribution
mu = 0;
sig = 2;
var1 = mu + sig*randn(1,len);
var2 = mu + sig*randn(1,len);
t1 = var1 .^ 2;
t2 = var2 .^ 2;
y1 = sqrt(t1+t2);


% %y2: Normal Distribution
y2 =  randn(1,len);


%y3 : What the result is expected to be:
mean0 = (sum(y0)+ sum(y1)+ sum(y2)) / (len * 3);%how do I calculate this?
var0 =  1;%how do I calculate this?
y3 = mean0 + var0*randn(1,len);
delta = 0.1;
x3 = min(y3):delta:max(y3);
figure('Name','Normal Distribution');
hist(y3,x3);


%Central Limit Theorem:
%what result is:
res = y0+y1+y2;
xn = min(res):delta:max(res);
figure('Name','Final Result');
hist(res,xn);

I have two main problems.

  1. How can I calculate the mean and variance for y3 (i.e. what should the result be)?

  2. Is my code correct?

Solution

Since y0, y1 and y2 are row vectors, you have to do:

mean0 = mean([y0 y1 y2]);
variance0 = var([y0 y1 y2]);

When you create [y0 y1 y2] you are building one big vector that holds all your previous samples (as if they were samples from one single distribution).

Now just plug it into the functions you want (mean and var), as shown above.
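Note that the pooled statistics describe the concatenated sample, not the element-wise sum res = y0+y1+y2 plotted in the question. For that sum, and assuming the three vectors are independent, the means add and the variances add. A minimal sketch of the difference (the names pooledMean, pooledVar, sumMean and sumVar are introduced here for illustration and are not from the original answer):

```matlab
% Sketch: statistics of the pooled sample vs. the element-wise sum.
% Assumes y0, y1, y2 are independent row vectors of equal length,
% generated the same way as in the question.
len = 50000;
lambda = 3;
sig = 2;

y0 = -log(rand(1,len))./lambda;                            % exponential
y1 = sqrt((sig*randn(1,len)).^2 + (sig*randn(1,len)).^2);  % Rayleigh
y2 = randn(1,len);                                         % standard normal

% Pooled sample: all 3*len values treated as one data set.
pooledMean = mean([y0 y1 y2]);
pooledVar  = var([y0 y1 y2]);

% Element-wise sum: means add, and variances add under independence.
res = y0 + y1 + y2;
sumMean = mean(y0) + mean(y1) + mean(y2);
sumVar  = var(y0) + var(y1) + var(y2);

fprintf('pooled: mean = %.4f, var = %.4f\n', pooledMean, pooledVar);
fprintf('sum:    mean = %.4f (direct: %.4f)\n', sumMean, mean(res));
fprintf('sum:    var  = %.4f (direct: %.4f)\n', sumVar, var(res));
```

With vectors of equal length, the pooled mean is the sum's mean divided by 3, so a normal curve meant to overlay the histogram of res would need sumMean and sqrt(sumVar) rather than the pooled values.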


About the statistical part: I think you are getting some things wrong. The Central Limit Theorem, in its classical form, applies to the sum of independent variables that all follow the same distribution. It can indeed be any distribution D (with finite variance), but all variables must have that same distribution D. You are trying to sum variables from different distributions.

The theorem says:
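In its classical (Lindeberg-Lévy) form, the statement can be written as follows. The notation is assumed here rather than taken from the original answer: $X_1, \dots, X_N$ are independent and identically distributed with mean $\mu$ and finite variance $\sigma^2$.

```latex
\sqrt{N}\left(\frac{1}{N}\sum_{i=1}^{N} X_i - \mu\right)
  \xrightarrow{\;d\;} \mathcal{N}(0, \sigma^2)
  \quad \text{as } N \to \infty
```

For the exponential distribution with rate lambda = 3 used in the example, this gives $\mu = 1/\lambda = 1/3$ and $\sigma^2 = 1/\lambda^2 = 1/9$.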

I've coded an example for variables that follow an exponential distribution. Run it and you will see that as N increases, the resulting distribution tends to the expected normal distribution. For N=1 you have the exponential distribution itself (very different from a normal distribution), but for N=100 the distribution is already very close to the expected normal one (the mean and variance are essentially the same).

[Figures: histograms of the CLT result for exponentials with N=1, N=3, N=10 and N=100, followed by the expected normal distribution (the limiting distribution of the CLT).]

clc;clear;
len = 50000;
lambda = 3;

%yA : Exponential Distribution A
yA=-log(rand(1,len))./lambda;

%yB : Exponential Distribution B
yB=-log(rand(1,len))./lambda;

%yC : Exponential Distribution C
yC=-log(rand(1,len))./lambda;

%yD : Exponential Distribution D
yD=-log(rand(1,len))./lambda;

%yE : Exponential Distribution E
yE=-log(rand(1,len))./lambda;

%yF : Exponential Distribution F
yF=-log(rand(1,len))./lambda;

%yG : Exponential Distribution G
yG=-log(rand(1,len))./lambda;

%yH : Exponential Distribution H
yH=-log(rand(1,len))./lambda;

%yI : Exponential Distribution I
yI=-log(rand(1,len))./lambda;

%yJ : Exponential Distribution J
yJ=-log(rand(1,len))./lambda;


%y1 : The result you expect (centred Gaussian with the same variance as the exponential):
mean0 = 0;
var0 =  var(yA);
y1 = mean0 + sqrt(var0)*randn(1,len);
delta = 0.01;
x1 = min(y1):delta:max(y1);
figure('Name','Normal Distribution (Expected)');
hist(y1,x1);


%Central Limit Theorem:
%what result is:
res1 = (((yA)/1) - mean(yA))*sqrt(1);
res2 = (((yA+yB)/2) - mean(yA))*sqrt(2);
res3 = (((yA+yB+yC)/3) - mean(yA))*sqrt(3);
res4 = (((yA+yB+yC+yD)/4) - mean(yA))*sqrt(4);
res5 = (((yA+yB+yC+yD+yE)/5) - mean(yA))*sqrt(5);
res10 = (((yA+yB+yC+yD+yE+yF+yG+yH+yI+yJ)/10) - mean(yA))*sqrt(10);
delta = 0.01;
xn = min(res1):delta:max(res1);
figure('Name','Final Result for N=1');
hist(res1,xn);
xn = min(res2):delta:max(res2);
figure('Name','Final Result for N=2');
hist(res2,xn);
xn = min(res3):delta:max(res3);
figure('Name','Final Result for N=3');
hist(res3,xn);
xn = min(res4):delta:max(res4);
figure('Name','Final Result for N=4');
hist(res4,xn);
xn = min(res5):delta:max(res5);
figure('Name','Final Result for N=5');
hist(res5,xn);
xn = min(res10):delta:max(res10);
figure('Name','Final Result for N=10');
hist(res10,xn);

%for N = 100
y100=-log(rand(100,len))./lambda;
res100 = ((sum(y100)/100) - mean(yA))*sqrt(100);
xn = min(res100):delta:max(res100);
figure('Name','Final Result for N=100');
hist(res100,xn);
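
The convergence the answer describes can also be checked numerically instead of by eye. The following is a minimal sketch, not part of the original answer. It assumes the true mean 1/lambda for centring (the code above uses mean(yA)), and the names Ns, v and sk are introduced here for illustration. Skewness is computed by hand to avoid depending on a toolbox function.

```matlab
% Sketch: numerical check of the CLT convergence for several N.
len = 50000;
lambda = 3;
Ns = [1 3 10 100];
v  = zeros(size(Ns));   % sample variance of the scaled mean, per N
sk = zeros(size(Ns));   % sample skewness (0 for a normal distribution)

for k = 1:numel(Ns)
    N = Ns(k);
    y = -log(rand(N,len))./lambda;          % N-by-len exponential samples
    res = (mean(y,1) - 1/lambda)*sqrt(N);   % centred, scaled column means
    v(k) = var(res);
    c = res - mean(res);
    sk(k) = mean(c.^3) / mean(c.^2)^1.5;    % third standardized moment
    fprintf('N = %3d: var = %.4f, skewness = %.3f\n', N, v(k), sk(k));
end
```

The variance should stay near 1/lambda^2 for every N, while the skewness shrinks roughly like 2/sqrt(N) as the histogram becomes more symmetric. Passing the dimension explicitly in mean(y,1) keeps the N=1 case from collapsing the row vector to a scalar.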
