Integrate 2D kernel density estimate


Question

I have an x,y distribution of points for which I obtain the KDE through scipy.stats.gaussian_kde. This is my code and a description of how the output looks (the x,y data were linked in the original post):

import numpy as np
from scipy import stats

# Obtain data from file.
data = np.loadtxt('data.dat', unpack=True)
m1, m2 = data[0], data[1]
xmin, xmax = min(m1), max(m1)
ymin, ymax = min(m2), max(m2)

# Perform a kernel density estimate (KDE) on the data
x, y = np.mgrid[xmin:xmax:100j, ymin:ymax:100j]
positions = np.vstack([x.ravel(), y.ravel()])
values = np.vstack([m1, m2])
kernel = stats.gaussian_kde(values)
f = np.reshape(kernel(positions).T, x.shape)

# Define the number that will determine the integration limits
x1, y1 = 2.5, 1.5

# Perform integration?

# Plot the results:
import matplotlib.pyplot as plt
# Set limits
plt.xlim(xmin,xmax)
plt.ylim(ymin,ymax)
# KDE density plot
plt.imshow(np.rot90(f), cmap=plt.cm.gist_earth_r, extent=[xmin, xmax, ymin, ymax])
# Draw contour lines
cset = plt.contour(x,y,f)
plt.clabel(cset, inline=1, fontsize=10)
plt.colorbar()
# Plot point
plt.scatter(x1, y1, c='r', s=35)
plt.show()

The red point with coordinates (x1, y1) has, like every point in the 2D plot, an associated value given by f (the KDE), which lies between 0 and 0.42. Let's say that f(x1, y1) = 0.08.

I need to integrate f over the regions in x and y where f evaluates to less than f(x1, y1), i.e. f(x, y) < 0.08.

From what I've seen, Python can integrate functions and one-dimensional arrays numerically, but I haven't found anything that would let me numerically integrate a 2D array (the f kernel). Furthermore, I'm not sure how I would even identify the regions given by that particular condition (i.e. f(x, y) less than a given value).

Can this be done?

Recommended answer

Here is a way to do it using Monte Carlo integration. It is a little slow, and there is randomness in the solution. The error is inversely proportional to the square root of the sample size, while the running time is directly proportional to the sample size (where sample size refers to the Monte Carlo sample, 10000 in my example below, not the size of your data set). Here is some simple code using your kernel object.

#Compute the point below which to integrate
iso = kernel((x1,y1))

#Sample from your KDE distribution
sample = kernel.resample(size=10000)

#Filter the sample
insample = kernel(sample) < iso

#The integral you want is equivalent to the probability of drawing a point 
#that gets through the filter
integral = insample.sum() / float(insample.shape[0])
print(integral)
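
Because each sampled point either passes the filter or does not, the estimate is a binomial proportion, so its standard error can be approximated as sqrt(p(1 - p) / n). The following is a minimal sketch of that idea, not part of the original answer. It assumes synthetic Gaussian data in place of data.dat, a hypothetical reference point (1.0, 0.5), a hypothetical helper named mc_integral_below, and a SciPy version whose resample method accepts a seed argument.

```python
import numpy as np
from scipy import stats


def mc_integral_below(kernel, point, n_samples=10000, seed=0):
    """Monte Carlo estimate of the KDE mass where the density is below
    the density at `point`, together with its binomial standard error."""
    # Density at the reference point (column vector of shape (d, 1)).
    iso = kernel(np.asarray(point, dtype=float).reshape(-1, 1))[0]
    # Draw points from the KDE itself and test each against the threshold.
    sample = kernel.resample(size=n_samples, seed=seed)
    p = np.mean(kernel(sample) < iso)
    # Standard error of a proportion estimated from n_samples draws.
    stderr = np.sqrt(p * (1.0 - p) / n_samples)
    return p, stderr


# Synthetic stand-in for the data file (an assumption for illustration).
rng = np.random.default_rng(0)
values = rng.normal(size=(2, 500))
kernel = stats.gaussian_kde(values)

estimate, stderr = mc_integral_below(kernel, (1.0, 0.5))
print(estimate, stderr)
```

Quadrupling n_samples should roughly halve stderr, which matches the square-root scaling described above.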

I get approximately 0.2 as the answer for your data set.
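
A deterministic alternative, closer to what the question literally asks for, is to evaluate the KDE on a regular grid, select the cells where f is below the threshold with a boolean mask, and sum those values times the cell area (a simple Riemann sum). The sketch below is an assumption-laden illustration rather than the answerer's method. It uses synthetic Gaussian data in place of data.dat, a hypothetical reference point (1.0, 0.5), and pads the grid beyond the data range, because the low-density region extends past the outermost data points and a grid clipped to the data limits would miss part of it.

```python
import numpy as np
from scipy import stats

# Synthetic stand-in for the data file (an assumption for illustration).
rng = np.random.default_rng(0)
values = rng.normal(size=(2, 500))
kernel = stats.gaussian_kde(values)

# Pad the grid by three kernel standard deviations on each side so that
# nearly all of the KDE's mass falls inside it.
pad = 3.0 * np.sqrt(np.diag(kernel.covariance))
xmin, ymin = values.min(axis=1) - pad
xmax, ymax = values.max(axis=1) + pad

# Regular grid and the KDE evaluated on it.
xs = np.linspace(xmin, xmax, 200)
ys = np.linspace(ymin, ymax, 200)
x, y = np.meshgrid(xs, ys, indexing="ij")
f = kernel(np.vstack([x.ravel(), y.ravel()])).reshape(x.shape)
cell_area = (xs[1] - xs[0]) * (ys[1] - ys[0])

# Threshold: the density at the (hypothetical) reference point.
iso = kernel(np.array([[1.0], [0.5]]))[0]

# Boolean mask identifies the region f(x, y) < iso; sum over it.
mask = f < iso
grid_integral = f[mask].sum() * cell_area

# Sanity check: the density summed over the whole grid should be close to 1.
total = f.sum() * cell_area
print(grid_integral, total)
```

The grid result has no sampling noise, but its accuracy depends on the grid resolution and on how far the grid extends, so comparing it against the Monte Carlo estimate is a reasonable cross-check.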
