Estimate pdf of a vector using Gaussian Kernel


Problem description

I am using a Gaussian kernel to estimate the pdf of some data, based on the equation (equation image not preserved), where K(.) is the Gaussian kernel and data is a given vector. z is the bin, running from 1 to 256, and the bin size is 1.
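The equation itself is missing from this copy of the question. Reading it back from the code below, it appears to be the usual kernel density estimate; the symbols N (number of samples), x_j (the j-th data value) and \sigma (kernel standard deviation) are notation assumed here, not taken from the original post:

```latex
\hat{p}(z) = \frac{1}{N} \sum_{j=1}^{N} K(z - x_j),
\qquad
K(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{x^{2}}{2\sigma^{2}}\right)
```

In the code, the inner loop accumulates the sum over j, and the division by N happens later, in the plot call (`pdf_est./length(data)`).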

I implemented it in MATLAB. However, the result shows that the amplitude of my pdf estimate (blue) does not match the real pdf of the data. Could you look at my code and give me some comments on it?

MATLAB code

function pdf_est=KDE()
close all;
%%Random values of 20 pixels, range=[1 256]
data=randi([1 256],1,20);

%% Estimate histogram%%%%% 
pdf_est=zeros(1,256);
z=256;

for i=1:z
    for j=1:length(data)
        pdf_est(i)=pdf_est(i)+Gaussian(i-data(j));
    end
end
%% Plot real histogram 1 to 256; binsize=1;
hold on
plot(imhist(uint8(data))./length(data),'r');
%% Plot histogram estimation
plot(pdf_est./length(data),'b');
hold off
function K=Gaussian(x)
   sigma=1;
   K=1./(sqrt(2*pi)*sigma)*exp(-x^2./(2*sigma^2));

Result: blue is my estimate and red is the real pdf.

Recommended answer

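You have two issues:

  1. A 1-unit displacement between the blue and red plots.
  2. The blue spikes are wider and shorter than the red ones.

How to solve each issue: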

  1. This is caused by a possible confusion between the data range 0,...,255 and the indexing interval 1,...,256. Since your data represents an 8-bit image, values should be 0,...,255 (not 1,...,256). Your plotted horizontal axis should then be 0,...,255. The same goes for the i variable in the for line. And then, since Matlab indexing starts at 1, you should use i+1 when indexing pdf_est.

  2. This is normal behaviour. You are assuming unit variance in your kernel. To see taller blue spikes you could reduce sigma to make the kernel narrower and taller. But you will never get exactly the same height as your data (the necessary sigma would depend on your data).

Actually, you have a tradeoff between height and width, controlled by sigma. But the important thing is that the area remains the same for any sigma. So I suggest plotting the CDF (area) instead of the pdf (area density). To do that, plot the accumulated histogram and pdf (using cumsum).
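To make the height/width tradeoff concrete, here is a minimal sketch that evaluates the same Gaussian kernel for a few values of sigma. The single sample at `x0 = 128` and the list `sigmas` are made up for illustration. With sigma = 1 the kernel peaks at about 0.399, so each sample adds a spike of roughly 0.399/N to the estimate, while a histogram bin holding that one sample has height 1/N.

```matlab
% Sketch: sigma trades peak height against width; the area stays close to 1.
% Assumption: one hypothetical sample at x0 on the integer grid 0..255.
z      = 0:255;          % bin centres, bin size 1
x0     = 128;            % hypothetical single data point
sigmas = [1 2 4];        % kernel widths to compare (illustrative values)
peak   = zeros(size(sigmas));
area   = zeros(size(sigmas));
for k = 1:numel(sigmas)
    s = sigmas(k);
    K = exp(-(z - x0).^2 ./ (2*s^2)) ./ (sqrt(2*pi)*s);
    peak(k) = max(K);    % equals 1/(sqrt(2*pi)*s): halves when s doubles
    area(k) = sum(K);    % Riemann sum with bin size 1
end
disp([sigmas; peak; area]);  % rows: sigma, peak height, area
```

For sigma well below 1 the unit-spaced grid is too coarse for the sum to approximate the integral, so the computed area drifts slightly away from 1.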

Code modified according to 1

function pdf_est=KDE()
close all;
%% Random values of 20 pixels, range=[0 255] after the "-1" shift
data=randi([1 256],1,20)-1; %// changed: "-1"

%% Estimate histogram%%%%% 
pdf_est=zeros(1,256);
z=256;

for i=0:z-1 %// changed: subtracted 1 
    for j=1:length(data)
        pdf_est(i+1)=pdf_est(i+1)+Gaussian(i-data(j)); %// changed: "+1" (twice)
    end
end
%% Plot real histogram 0 to 255; binsize=1;
hold on
plot(0:255, imhist(uint8(data))./length(data),'r'); %// changed: explicit x axis
%% Plot histogram estimation
plot(0:255, pdf_est./length(data),'b'); %// changed: explicit x axis
hold off
function K=Gaussian(x)
   sigma=1; %// change? Set as desired
   K=1./(sqrt(2*pi)*sigma)*exp(-x^2./(2*sigma^2));
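
The "+1" shift can also be checked without `imhist`, which needs the Image Processing Toolbox. This is only a sketch with made-up sample values in `data`; it counts each 8-bit value v at index v+1, the same mapping used for `pdf_est` above.

```matlab
% Toolbox-free histogram of 8-bit values.
% Assumption: data holds integers in 0..255 (hypothetical values below).
data   = [0 0 5 255];
counts = accumarray(data(:) + 1, 1, [256 1]);  % value v is counted at index v+1
rel    = counts ./ numel(data);                % relative frequencies, sum to 1
```

Plotting `rel` against `0:255` then puts each spike at its actual intensity value.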

Code modified according to 1 and 2

function pdf_est=KDE()
close all;
%% Random values of 20 pixels, range=[0 255] after the "-1" shift
data=randi([1 256],1,20)-1; %// changed: "-1"

%% Estimate histogram%%%%% 
pdf_est=zeros(1,256);
z=256;

for i=0:z-1 %// changed: subtracted 1 
    for j=1:length(data)
        pdf_est(i+1)=pdf_est(i+1)+Gaussian(i-data(j)); %// changed: "+1" (twice)
    end
end
%% Plot real (cumulative) histogram 0 to 255; binsize=1;
hold on
plot(0:255, cumsum(imhist(uint8(data))./length(data)),'r'); %// changed: explicit x axis
                                                            %// changed: cumsum
%% Plot histogram estimation
plot(0:255, cumsum(pdf_est./length(data)),'b'); %// changed: explicit x axis
                                                %// changed: cumsum
hold off
function K=Gaussian(x)
   sigma=1; %// change? Set as desired
   K=1./(sqrt(2*pi)*sigma)*exp(-x^2./(2*sigma^2));
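
As a possible alternative to the double loop, the same estimate can be sketched in vectorized form. This assumes MATLAB R2016b or later (or Octave) for implicit expansion in `z - data(:).'`, and the values in `data` are made up for illustration.

```matlab
% Vectorized sketch of the same estimator.
% Assumption: integer data in 0..255 (hypothetical values below).
data    = [10 10 100 200];
sigma   = 1;                 % kernel standard deviation
z       = (0:255).';         % column of bin centres
d       = z - data(:).';     % 256-by-N matrix of differences z(i) - data(j)
K       = exp(-d.^2 ./ (2*sigma^2)) ./ (sqrt(2*pi)*sigma);
pdf_est = mean(K, 2).';      % average over samples, 1-by-256 row
cdf_est = cumsum(pdf_est);   % running area, close to 1 at the right edge
```

Because `mean` already divides by the number of samples, `pdf_est` here should be plotted directly, without the extra `./length(data)`.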
