Creating 2D bins in R


Question

I have coordinate data in R, and I would like to determine the distribution of where my points lie. The entire space of points is a square of side length 100.

I'd like to assign points to different cells of the square, for example by rounding each coordinate to the nearest 5. I've seen examples using cut and findInterval, but I'm not sure how to use them to create 2D bins.

Actually, what I want to do is smooth the distribution so that there are no huge jumps between neighbouring regions of the grid.

For example (this is just meant to illustrate the problem):

set.seed(1)
x <- runif(2000, 0, 100)
y <- runif(2000, 0, 100)
plot(y~x)
points(x = 21, y = 70, col = 'red', pch = 21, cex = 2, bg = 'red')  # pch = 21 so that bg fills the marker

The red point is clearly in a region that, by chance, contains few other points, so the density there would jump relative to the density of the neighbouring regions. I'd like to be able to smooth this out.

Recommended answer

You can get the binned data using the bin2 function in the ash package.
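
For comparison, the same kind of 2D count table can be sketched in base R with cut, which the question mentions. This is not part of the original answer: the 5-unit cell width is an assumption taken from the question's "nearest 5", and the names breaks, x_bin, y_bin and counts are illustrative.

```r
# Same simulated data as in the question
set.seed(1)
x <- runif(2000, 0, 100)
y <- runif(2000, 0, 100)

# Assumed cell width of 5 on the 100 x 100 square: 20 cells per axis
breaks <- seq(0, 100, by = 5)

# Label each coordinate with its interval [0,5), [5,10), ..., [95,100]
x_bin <- cut(x, breaks = breaks, right = FALSE, include.lowest = TRUE)
y_bin <- cut(y, breaks = breaks, right = FALSE, include.lowest = TRUE)

# Cross-tabulate the two factors: one count per 2D cell
counts <- table(x_bin, y_bin)

# Cell that contains the red point (21, 70)
counts[findInterval(21, breaks), findInterval(70, breaks)]
```

Because cut returns factors that carry all 20 levels, table keeps empty cells as zeros, so counts is always a full 20 x 20 grid.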

Regarding the sparsity of data in the region around the red point, one possible solution is the average shifted histogram (ASH). It bins your data after shifting the histogram origin several times and averages the bin counts, which alleviates the dependence on the bin origin. For example, imagine how the number of points in the bin containing the red point changes depending on whether the red point is at the top left or the bottom right of its bin.
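
The effect of the bin origin can be illustrated with a small base R sketch. It counts the points that share a bin with the red point for several shifted grids and then averages those counts. The bin width h = 5 and the number of shifts m = 5 are assumptions chosen for illustration, count_in_bin is a hypothetical helper, and this is a hand-rolled demonstration of the idea rather than the computation the ash package performs.

```r
# Same simulated data as in the question
set.seed(1)
x <- runif(2000, 0, 100)
y <- runif(2000, 0, 100)

h  <- 5    # assumed bin width
m  <- 5    # assumed number of origin shifts per axis
px <- 21   # red point, x coordinate
py <- 70   # red point, y coordinate

# Grid origins shifted by multiples of h / m: 0, 1, 2, 3, 4
shifts <- (seq_len(m) - 1) * h / m

# Hypothetical helper: number of points in the bin that contains
# (px, py) when the grid origin is at (sx, sy)
count_in_bin <- function(sx, sy) {
  x0 <- sx + floor((px - sx) / h) * h   # left edge of that bin
  y0 <- sy + floor((py - sy) / h) * h   # bottom edge of that bin
  sum(x >= x0 & x < x0 + h & y >= y0 & y < y0 + h)
}

# One count per combination of x shift and y shift (an m x m matrix)
shifted_counts <- outer(shifts, shifts, Vectorize(count_in_bin))

shifted_counts        # how much the count depends on the origin
mean(shifted_counts)  # the averaged (unnormalised) count at the red point
```

The spread of values in shifted_counts shows how sensitive a single histogram is to where the grid starts, and the mean is the kind of averaged value that smooths this out.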

library(ash)
bins <- bin2(cbind(x,y))
f <- ash2(bins, m = c(5,5))  # ash2's default smoothing, as a baseline

image(f$x,f$y,f$z)
contour(f$x,f$y,f$z,add=TRUE)
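
If the raw counts should sit on the square itself rather than on a range chosen by bin2, the ab and nbin arguments can be passed explicitly. The following is a sketch under assumptions: the limits 0 to 100 and the 20 bins per axis (5-unit cells) come from the question, not from the original answer, and the names ab and bins5 are illustrative. The block is guarded so that it only runs when the ash package is installed.

```r
# Same simulated data as in the question
set.seed(1)
x <- runif(2000, 0, 100)
y <- runif(2000, 0, 100)

if (requireNamespace("ash", quietly = TRUE)) {
  library(ash)

  # One row per variable: lower and upper limit of the binning interval
  ab <- matrix(c(0, 100,
                 0, 100), nrow = 2, byrow = TRUE)

  # 20 bins per axis on [0, 100] gives cells of width 5 (assumed)
  bins5 <- bin2(cbind(x, y), ab = ab, nbin = c(20, 20))

  dim(bins5$nc)   # matrix of raw bin counts
  bins5$nskip     # number of points that fell outside ab
}
```

The nc component holds the unsmoothed counts, which is the binned data itself; ash2 then smooths over these bins.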

If you would like smoother bins, you could try increasing the argument m, a vector of length 2 that controls the smoothing parameter along each variable.

f2 <- ash2(bins, m = c(10,10))
image(f2$x, f2$y, f2$z)
contour(f2$x,f2$y,f2$z,add=TRUE)

Compare f and f2.

The binning algorithm is implemented in Fortran and is very fast.
