PDF and CDF plot for central limit theorem using Matlab


Question

I am struggling to plot the PDF and CDF of Sn = X1 + X2 + X3 + ... + Xn, illustrating the central limit theorem, for n = 1, 2, 3, 4, 5, 10, 20, 40. I am taking each Xi to be a continuous uniform random variable on the interval (0,3).

Here is what I have done so far:
close all
%different sizes of input X
%N=[1 5 10 50];
N = [1 2 3 4 5 10 20 40];

%interval (a,b) = (0,3) for random variables
a=0;
b=3;

%to store sum of different sizes of input
for i=1:length(N)
    %generates uniform random numbers in the interval
    X = a + (b-a).*rand(N(i),1);
    S=zeros(1,length(X));
    S=cumsum(X);
    cd=cdf('Uniform',S,0,3);
    plot(cd);
    hold on;
end
legend('n=1','n=2','n=3','n=4','n=5','n=10','n=20','n=40');
title('CDF PLOT')
figure;

for i=1:length(N)
%generates uniform random numbers in the interval
    X = a + (b-a).*rand(N(i),1);
    S=zeros(1,length(X));
    S=cumsum(X);
    cd=pdf('Uniform',S,0,3);
    plot(cd);
    hold on;
end
legend('n=1','n=2','n=3','n=4','n=5','n=10','n=20','n=40');
title('PDF PLOT')

My output is nowhere near what I am expecting. Any help is much appreciated.

Recommended answer

This can be done with vectorization using rand() and cumsum().

For example, the code below generates 10000 samples for each of 40 Uniform(0,3) random variables and stores them in X (one row per variable, one column per sample). To meet the Central Limit Theorem (CLT) assumptions, they are independent and identically distributed (i.i.d.). Then cumsum() transforms this into 10000 copies of Sn = X1 + X2 + ... + Xn for each row. Note that in the code, n denotes the number of copies, not the number of terms in the sum. The first row holds n = 10000 copies of S_1 = X1, the 5th row holds n copies of S_5 = X1 + X2 + X3 + X4 + X5, and the last row holds n copies of S_40.

% MATLAB R2019a
% Setup
N = [1:5 10 20 40];    % values of n we are interested in
LB = 0;                % lowerbound for X ~ Uniform(LB,UB)
UB = 3;                % upperbound for X ~ Uniform(LB,UB)
n = 10000;             % Number of copies (samples) for each random variable

% Generate random variates
X = LB + (UB - LB)*rand(max(N),n);     % X ~ Uniform(LB,UB)    (i.i.d.)
Sn = cumsum(X); 

You can see from the image that, for the n = 2 case, the sum indeed follows a Triangular(0,3,6) distribution. For the n = 40 case, the sum is approximately Normally distributed (Gaussian) with mean 60 (40*mean(X) = 40*1.5 = 60). This shows the convergence in distribution for both the probability density function (PDF) and the cumulative distribution function (CDF).

Note: The CLT is often stated as convergence in distribution to a Normal distribution with zero mean, because the sum has been shifted. Subtracting mean(Sn) = n*mean(X) = n*0.5*(LB+UB) from Sn accomplishes this shift.
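The shift in the note can be taken one step further by also dividing by the standard deviation of Sn, which gives a standardized sum that is approximately Normal(0,1) for large n. A minimal sketch, assuming the same Uniform(0,3) setup; the names numCopies, maxN, muSn, sigmaSn and Zn are illustrative and not part of the original answer:

```matlab
% Sketch: centre and scale each S_n so that it is approximately N(0,1).
LB = 0;  UB = 3;                  % X ~ Uniform(LB,UB)
numCopies = 10000;                % assumed number of copies per random variable
maxN = 40;                        % largest number of terms in the sum

X  = LB + (UB - LB)*rand(maxN, numCopies);
Sn = cumsum(X);                   % row k holds numCopies draws of S_k

k       = (1:maxN)';              % number of terms for each row
muSn    = k*(LB + UB)/2;          % E[S_k] = k*0.5*(LB+UB)
sigmaSn = sqrt(k/12)*(UB - LB);   % sqrt(Var(S_k)) = sqrt((k/12)*(UB-LB)^2)

% Implicit expansion (R2016b or later) applies the column vectors row by row.
Zn = (Sn - muSn) ./ sigmaSn;

zMean = mean(Zn(maxN,:));         % expected to be close to 0
zStd  = std(Zn(maxN,:));          % expected to be close to 1
```

With this scaling, every row of Zn can be compared against the same standard Normal curve, instead of a different Normal for each n.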

The code below isn't the gold standard, but it produced the image.

figure
s(11) = subplot(6,2,1)  % n = 1
    histogram(Sn(1,:),'Normalization','pdf')
    title(s(11),'n = 1')
s(12) = subplot(6,2,2)
    cdfplot(Sn(1,:))
    title(s(12),'n = 1') 
s(21) = subplot(6,2,3)   % n = 2
    histogram(Sn(2,:),'Normalization','pdf')
    title(s(21),'n = 2')
s(22) = subplot(6,2,4)
    cdfplot(Sn(2,:))
    title(s(22),'n = 2') 
s(31) = subplot(6,2,5)  % n = 5
    histogram(Sn(5,:),'Normalization','pdf')
    title(s(31),'n = 5')
s(32) = subplot(6,2,6)
    cdfplot(Sn(5,:))
    title(s(32),'n = 5') 
s(41) = subplot(6,2,7)  % n = 10
    histogram(Sn(10,:),'Normalization','pdf')
    title(s(41),'n = 10')
s(42) = subplot(6,2,8)
    cdfplot(Sn(10,:))
    title(s(42),'n = 10') 
s(51) = subplot(6,2,9)   % n = 20
    histogram(Sn(20,:),'Normalization','pdf')
    title(s(51),'n = 20')
s(52) = subplot(6,2,10)
    cdfplot(Sn(20,:))
    title(s(52),'n = 20') 
s(61) = subplot(6,2,11)   % n = 40
    histogram(Sn(40,:),'Normalization','pdf')
    title(s(61),'n = 40')
s(62) = subplot(6,2,12)
    cdfplot(Sn(40,:))
    title(s(62),'n = 40') 
sgtitle({'PDF (left) and CDF (right) for Sn with n \in \{1, 2, 5, 10, 20, 40\}';'note different axis scales'})

for tgt = [11:10:61 12:10:62]
    xlabel(s(tgt),'Sn')
    if rem(tgt,2) == 1
        ylabel(s(tgt),'pdf')
    else                           %  rem(tgt,2) == 0
        ylabel(s(tgt),'cdf')
    end
end
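The same figure can probably be produced more compactly with a loop over the six plotted values of n. The following is only a sketch under the same setup, not the original author's code; the names plotN, numCopies and ax are illustrative, and cdfplot() still requires the Statistics Toolbox:

```matlab
% Sketch: loop-based version of the 6-by-2 grid of PDF/CDF plots.
LB = 0;  UB = 3;                       % X ~ Uniform(LB,UB)
numCopies = 10000;                     % assumed number of copies
plotN = [1 2 5 10 20 40];              % values of n shown in the figure
Sn = cumsum(LB + (UB - LB)*rand(max(plotN), numCopies));

figure
ax = gobjects(numel(plotN), 2);        % column 1 = pdf axes, column 2 = cdf axes
for r = 1:numel(plotN)
    k = plotN(r);

    ax(r,1) = subplot(numel(plotN), 2, 2*r - 1);     % left column: pdf
    histogram(Sn(k,:), 'Normalization', 'pdf')
    title(ax(r,1), sprintf('n = %d', k))
    xlabel(ax(r,1), 'Sn'), ylabel(ax(r,1), 'pdf')

    ax(r,2) = subplot(numel(plotN), 2, 2*r);         % right column: cdf
    cdfplot(Sn(k,:))
    title(ax(r,2), sprintf('n = %d', k))
    xlabel(ax(r,2), 'Sn'), ylabel(ax(r,2), 'cdf')
end
sgtitle({'PDF (left) and CDF (right) for Sn'; 'note different axis scales'})
```

Storing the handles in a 6-by-2 array avoids the sparse indices such as s(11) and s(62), and makes it easy to change which values of n are plotted.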

Key functions used for plotting: histogram() from base MATLAB and cdfplot() from the Statistics Toolbox. Note that this could be done manually, without the Statistics Toolbox, using a few lines to obtain the cdf and then a call to plot().
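One way the manual approach might look is to sort the samples and pair them with cumulative proportions, which gives the empirical CDF. This is a minimal sketch using base MATLAB only; the names numCopies, Sk, xs and Fs are illustrative, and stairs() is used in place of plot() to show the step shape:

```matlab
% Sketch: empirical CDF of S_40 without the Statistics Toolbox.
LB = 0;  UB = 3;                       % X ~ Uniform(LB,UB)
numCopies = 10000;                     % assumed number of copies
k = 40;                                % number of terms in the sum
Sk = sum(LB + (UB - LB)*rand(k, numCopies), 1);   % numCopies draws of S_k

xs = sort(Sk);                         % sorted sample values
Fs = (1:numCopies)/numCopies;          % proportion of samples <= xs(i)

figure
stairs(xs, Fs)
xlabel('Sn'), ylabel('cdf')
title(sprintf('Empirical CDF, n = %d', k))
```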

There was some concern over the variance of Sn.

Note that the variance of Sn is given by (n/12)*(UB-LB)^2, where n here is the number of terms in the sum (derivation below). Monte Carlo simulation shows our samples of Sn do have the correct variance; indeed, the sample variance converges to this value as the number of copies gets larger. Simply call var(Sn(40,:)).

% with n = 10000 copies
var(Sn(40,:))         % sample variance of S_40, e.g. 29.9505 (will vary slightly depending on random seed)
(40/12)*((UB-LB)^2)   % theoretical var(S_40) = 30
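The same comparison can be run for every n of interest at once. A minimal sketch, assuming the same setup; the names numCopies, sampleVar, theoryVar and relErr are illustrative:

```matlab
% Sketch: compare sample and theoretical variance of S_n for each n in N.
N = [1:5 10 20 40];                    % values of n we are interested in
LB = 0;  UB = 3;                       % X ~ Uniform(LB,UB)
numCopies = 10000;                     % assumed number of copies
Sn = cumsum(LB + (UB - LB)*rand(max(N), numCopies));

sampleVar = var(Sn(N,:), 0, 2)';       % variance along each selected row
theoryVar = (N/12)*(UB - LB)^2;        % (n/12)*(UB-LB)^2
relErr    = abs(sampleVar - theoryVar)./theoryVar;

% Columns: n, sample variance, theoretical variance
disp([N; sampleVar; theoryVar]')
```

The relative error in relErr should shrink as numCopies grows, since the sample variance is a consistent estimator.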

You can see the convergence is very good by S_40:

step = 0.01;
Domain = 40:step:80;

mu = 40*(LB+UB)/2;
sigma = sqrt((40/12)*((UB-LB)^2));

figure, hold on
histogram(Sn(40,:),'Normalization','pdf')
plot(Domain,normpdf(Domain,mu,sigma),'r-','LineWidth',1.4)
ylabel('pdf')
xlabel('S_n')

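Derivation of the mean and variance of Sn:

E[S_n] = E[X_1 + ... + X_n] = E[X_1] + ... + E[X_n] = n*E[X_1] = n*(LB+UB)/2

For the expectation (mean), the second equality holds by linearity of expectation. The third equality holds since the X_i are identically distributed.

Var(S_n) = Var(X_1 + ... + X_n) = Var(X_1) + ... + Var(X_n) = n*Var(X_1) = (n/12)*(UB-LB)^2

For the variance, the second equality holds since the X_i are independent, and the third since they are identically distributed.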

A discrete version of this is posted here.
