Calculate probability of point on 2d density surface

Problem description

If I calculate the 2d density surface of two vectors, as in this example:

library(MASS)
a <- rnorm(1000)
b <- rnorm(1000, sd=2)
f1 <- kde2d(a, b, n = 100)

I get the following surface:

filled.contour(f1)

The z-values are the estimated density.

My question now is: is it possible to calculate the probability of a single point, e.g. a = 1, b = -4?

[As I'm not a statistician, this may be the wrong wording, sorry for that. I would like to know, if this is possible at all, with what probability a point occurs.]

Thanks for every comment!

Recommended answer

If you specify an area, that area has a probability with respect to your density function. A single point, however, always has probability zero. What it does have is a density, which is generally non-zero. So what is that density?

The density at a point is the limit of the probability of a region around that point (the integral of the density over the region) divided by the region's ordinary area, as that area shrinks to zero. (This was actually rather hard to state correctly; it took a few tries and is still not optimal.)
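To make the limit concrete, here is a minimal sketch. It does not use the kernel estimate; instead it assumes the true distribution the example samples from (independent a ~ N(0, 1) and b ~ N(0, sd = 2)), so that the probability of a small square can be computed exactly with `pnorm`. The helper name `square_prob` is made up for this illustration. As the square around (1, -4) shrinks, its probability goes to zero, while probability divided by area approaches the density at that point.

```r
# Exact probability that (a, b) lands in the square of half-width h
# centred on (x0, y0), assuming independent a ~ N(0, 1), b ~ N(0, sd = 2).
# (square_prob is a hypothetical helper, not part of any package.)
square_prob <- function(x0, y0, h) {
  (pnorm(x0 + h) - pnorm(x0 - h)) *
    (pnorm(y0 + h, sd = 2) - pnorm(y0 - h, sd = 2))
}

x0 <- 1
y0 <- -4
h <- c(1, 0.1, 0.01, 0.001)      # shrinking half-widths
p <- square_prob(x0, y0, h)      # probabilities, tending to zero
ratio <- p / (2 * h)^2           # probability per unit area

# Density of the assumed true distribution at the point itself
true_density <- dnorm(x0) * dnorm(y0, sd = 2)

print(data.frame(half_width = h, probability = p, prob_per_area = ratio))
print(true_density)
```

The last column of the printed table should settle on the value of `true_density`, which is the sense in which a point has a density but no probability of its own.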

All of this is really basic calculus. It is also fairly easy to write a routine that calculates the integral of that density over an area, although I imagine MASS has standard ways to do it that use more sophisticated integration techniques. Here is a quick routine that I threw together based on your example:

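library(MASS)
n <- 100                          # grid points per axis
a <- rnorm(1000)
b <- rnorm(1000, sd=2)
f1 <- kde2d(a, b, n = n)
# kde2d spans the data range by default, so these are its grid limits
lims <- c(min(a),max(a),min(b),max(b))

filled.contour(f1)

# Approximate probability of the rectangle [xmin,xmax] x [ymin,ymax]:
# mean density over the grid points inside it, times the rectangle's area.
prob <- function(f,xmin,xmax,ymin,ymax,n,lims){
  # Nearest grid index for coordinate v, clamped to 1..n.
  # Grid point i sits at lo + (i-1)*(hi-lo)/(n-1).
  idx <- function(v,lo,hi) min( n, max( 1, round(1 + (n-1)*(v-lo)/(hi-lo)) ) )
  ixmin <- idx(xmin,lims[1],lims[2])
  ixmax <- idx(xmax,lims[1],lims[2])
  iymin <- idx(ymin,lims[3],lims[4])
  iymax <- idx(ymax,lims[3],lims[4])
  avg <- mean(f$z[ixmin:ixmax,iymin:iymax])
  probval <- (xmax-xmin)*(ymax-ymin)*avg
  return(probval)
}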
# Outputs below are from the original answer's run; no seed is set, so yours will differ.
prob(f1,0.5,1.5,-4.5,-3.5,n,lims)
# [1] 0.004788993
prob(f1,-1,1,-1,1,n,lims)
# [1] 0.2224353
prob(f1,-2,2,-2,2,n,lims)
# [1] 0.5916984
prob(f1,0,1,-1,1,n,lims)
# [1] 0.119455
prob(f1,1,2,-1,1,n,lims)
# [1] 0.05093696
prob(f1,-3,3,-3,3,n,lims)
# [1] 0.8080565
lims
# [1] -3.081773  4.767588 -5.496468  7.040882

Caveat: the routine seems right and gives reasonable answers, but it has not undergone anywhere near the scrutiny I would give a production function.
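One way to add some of that scrutiny is to compare a grid-based estimate against a value that is known exactly. The sketch below is an alternative to the routine above, not a replacement endorsed by the answer: `grid_prob` and `exact_prob` are hypothetical helper names, the seed is an arbitrary choice for repeatability, and the exact reference relies on the assumption that the data really come from independent N(0, 1) and N(0, sd = 2) draws, as in the example. It sums the density at the grid points inside the rectangle, using the grid coordinates `f$x` and `f$y` returned by `kde2d`, and multiplies by the area of one grid cell.

```r
library(MASS)

set.seed(42)                     # arbitrary seed, only for repeatability
a <- rnorm(1000)
b <- rnorm(1000, sd = 2)
f1 <- kde2d(a, b, n = 100)

# Riemann-sum estimate: density at every grid point inside the rectangle,
# summed and multiplied by the area of one grid cell.
grid_prob <- function(f, xmin, xmax, ymin, ymax) {
  dx <- diff(f$x[1:2])
  dy <- diff(f$y[1:2])
  in_x <- f$x >= xmin & f$x <= xmax
  in_y <- f$y >= ymin & f$y <= ymax
  sum(f$z[in_x, in_y]) * dx * dy
}

# Exact probability of the same rectangle under the assumed true distribution
exact_prob <- function(xmin, xmax, ymin, ymax) {
  (pnorm(xmax) - pnorm(xmin)) *
    (pnorm(ymax, sd = 2) - pnorm(ymin, sd = 2))
}

est <- grid_prob(f1, -1, 1, -1, 1)
ref <- exact_prob(-1, 1, -1, 1)
total <- grid_prob(f1, -Inf, Inf, -Inf, Inf)   # whole grid, should be near 1

print(c(estimate = est, exact = ref, whole_grid = total))
```

The estimate will not match the exact value closely: it carries sampling error from the 1000 draws, smoothing from the kernel, and discretisation error from the 100 x 100 grid. A large gap, or a whole-grid total far from 1, would still be a useful warning sign.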
