Weighted Gaussian kernel density estimation in `python`
Problem description
Update: Weighted samples are now supported by `scipy.stats.gaussian_kde`. See here and here for details.
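As a minimal sketch of the built-in support mentioned in the update, the snippet below passes a `weights` array to `scipy.stats.gaussian_kde` (available in SciPy 1.2 and later). The data, the weights, and the evaluation grid are made up for illustration and are not taken from the original post.

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)

# Hypothetical data: draws from a standard normal, with illustrative
# weights that favour samples on the right-hand side.
samples = rng.normal(size=500)
weights = np.exp(samples)  # weights do not need to be normalized

# Unweighted and weighted estimates; `weights` needs SciPy >= 1.2.
kde_unweighted = gaussian_kde(samples)
kde_weighted = gaussian_kde(samples, weights=weights)

# Evaluate both densities on a regular grid.
grid = np.linspace(-5.0, 6.0, 1101)
dx = grid[1] - grid[0]
density_unweighted = kde_unweighted(grid)
density_weighted = kde_weighted(grid)

# Rough numerical means of the two estimated densities.
mean_unweighted = np.sum(grid * density_unweighted) * dx
mean_weighted = np.sum(grid * density_weighted) * dx

# Effective sample size that SciPy uses for the bandwidth rule.
effective_n = kde_weighted.neff
```

With these illustrative weights the weighted estimate should shift to the right of the unweighted one, and `effective_n` should be smaller than the number of samples.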
It is currently not possible to use `scipy.stats.gaussian_kde` to estimate the density of a random variable based on weighted samples. What methods are available to estimate densities of continuous random variables based on weighted samples?
Neither `sklearn.neighbors.KernelDensity` nor `statsmodels.nonparametric` seems to support weighted samples. I modified `scipy.stats.gaussian_kde` to allow for heterogeneous sampling weights and thought the results might be useful for others. An example is shown below.
An `ipython` notebook can be found here: http://nbviewer.ipython.org/gist/tillahoffmann/f844bce2ec264c1c8cb5
Implementation details
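The weighted arithmetic mean, for weights $w_i$ normalized so that $\sum_{i=1}^{N} w_i = 1$, is
$$\bar{x} = \sum_{i=1}^{N} w_i x_i .$$
The unbiased data covariance matrix is then given by
$$\Sigma = \frac{1}{1 - \sum_{i=1}^{N} w_i^2} \sum_{i=1}^{N} w_i \, (x_i - \bar{x})(x_i - \bar{x})^T .$$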
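The two quantities above can be sketched in NumPy as follows. This is an illustration, not the code from the notebook: it assumes the weights are normalized to sum to one and that the unbiased estimate uses the standard correction factor `1 / (1 - sum(w**2))`. The function name `weighted_mean_and_cov` and the sample data are hypothetical.

```python
import numpy as np

def weighted_mean_and_cov(data, weights):
    """Weighted mean and unbiased weighted covariance.

    data has shape (d, n): d dimensions, n samples.
    weights has shape (n,) and is normalized internally.
    """
    data = np.atleast_2d(np.asarray(data, dtype=float))
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                      # normalize so the weights sum to 1

    mean = data @ w                      # weighted arithmetic mean, shape (d,)
    residual = data - mean[:, None]      # centre each dimension

    # Weighted outer products, then the bias correction 1 / (1 - sum(w^2)).
    cov = (residual * w) @ residual.T
    cov /= 1.0 - np.sum(w ** 2)
    return mean, cov

# Hypothetical two-dimensional data with arbitrary positive weights.
rng = np.random.default_rng(1)
data = rng.normal(size=(2, 200))
weights = rng.uniform(0.5, 2.0, size=200)
mean, cov = weighted_mean_and_cov(data, weights)
```

For equal weights the correction factor reduces to `n / (n - 1)`, so the result matches the usual unbiased sample covariance. The output can also be compared against `np.cov(data, aweights=weights)`.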
The bandwidth can be chosen by the `scott` or `silverman` rules, as in `scipy`. However, the number of samples used to calculate the bandwidth is Kish's approximation for the effective sample size.
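To make the bandwidth choice concrete, here is a small sketch of Kish's effective sample size, `(sum w)^2 / sum(w^2)`, plugged into the Scott and Silverman factors in the form SciPy uses them. The helper names are hypothetical and the weights are illustrative.

```python
import numpy as np

def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / np.sum(w ** 2)

def scott_factor(n_eff, d):
    """Scott's rule with the effective sample size in place of n."""
    return n_eff ** (-1.0 / (d + 4))

def silverman_factor(n_eff, d):
    """Silverman's rule with the effective sample size in place of n."""
    return (n_eff * (d + 2) / 4.0) ** (-1.0 / (d + 4))

# Equal weights recover the plain sample count.
n_equal = kish_effective_sample_size(np.ones(100))

# Uneven (illustrative) weights give fewer effective samples, hence a
# larger bandwidth factor than the unweighted rule would give.
rng = np.random.default_rng(2)
uneven = rng.exponential(size=100)
n_uneven = kish_effective_sample_size(uneven)

factor_equal = scott_factor(n_equal, d=1)
factor_uneven = scott_factor(n_uneven, d=1)
```

The kernel covariance is then the data covariance scaled by the square of the chosen factor, so a smaller effective sample size leads to a smoother estimate.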