使用莫顿阶进行最近邻搜索的好处? [英] Benefits of nearest neighbor search with Morton-order?

查看:182
本文介绍了使用莫顿阶进行最近邻搜索的好处?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在模拟粒子相互作用时,我偶然发现了莫顿阶(Z阶)的网格索引(

While working on the simulation of particle interactions, I stumbled across grid indexing in Morton-order (Z-order)(Wikipedia link) which is regarded to provide an efficient nearest neighbor cell search. The main reason that I've read is the almost sequential ordering of spatially close cells in memory.

在第一个实现的中间,我无法集中精力了解如何有效地为最近的邻居实现算法,特别是与基本的统一网格相比.

Being in the middle of a first implementation, I can not wrap my head around how to efficiently implement the algorithm for the nearest neighbors, especially in comparison to a basic uniform grid.

  1. 给定一个像元(x,y),获得8个相邻像元索引并计算各自的z索引是很简单的.尽管这提供了对元素的恒定访问时间,但是必须计算z索引或在预定义表中查找z索引(每个轴和OR'ing分开).这怎么可能更有效率?是真的,以顺序访问数组A中的元素说A [0]-> A

  1. Given a cell (x,y) it is trivial to obtain the 8 neighbor cell indices and compute the respective z-index. Although this provides constant access time to the elements, the z-index has either to be calculated or looked up in predefined tables (separate for each axis and OR'ing). How can this possibly be more efficient? Is it true, that accessing elements in an array A in an order say A[0] -> A1 -> A[3] -> A[4] -> ... is more efficient than in an order A[1023] -> A[12] -> A[456] -> A[56] -> ...?

我期望有一种更简单的算法来查找z顺序中最接近的邻居.大致思路:找到邻居的第一个单元,进行迭代.但这不是真的,因为这仅在2 ^ 4大小的块内有效.但是,存在两个问题:当单元不在边界上时,可以轻松确定该块的第一个单元并遍历该块中的单元,但是必须检查该单元是否是最近的邻居.当单元位于边界上时,更糟糕的是,必须考虑2 ^ 5个单元.我在这里想念什么?是否有一个相对简单有效的算法可以满足我的需要?

I've expected that there exists a simpler algorithm to find the nearest neighbors in z-order. Something along the lines: find first cell of neighbors, iterate. But this can't be true, as this works nicely only within 2^4 sized blocks. There are two problems however: When the cell is not on the boundary, one can easily determine the first cell of the block and iterate through the cells in the block, but one has to check whether the cell is a nearest neighbor. Worse is the case when the cell lies on the boundary, than one has to take into account 2^5 cells. What am I missing here? Is there a comparatively simple and efficient algorithm that will do what I need?

第1点中的问题很容易测试,但是我对所描述的访问模式所生成的基本指令不是很熟悉,并且我真的很想了解幕后的情况.

The question in point 1. is easily testable, but I'm not very familiar with the underlying instructions that the described access pattern generates and would really like to understand what is going on behind the scenes.

在此先感谢您的帮助,参考资料等.



谢谢您澄清第一点!因此,使用Z排序,有趣的是平均提高了相邻单元的高速缓存命中率.有没有一种方法可以配置缓存命中率/未命中率?

Thanks in advance for any help, references, etc...



Thank you for clarifying point 1! So, with Z-ordering, the cache hit rate is increased on average for neighbor cells, interesting. Is there a way to profile cache hit/miss rates?

关于要点2: 我应该补充一点,我理解如何为R ^ d中的点云构建Morton排序的数组,其中索引i = f(x1,x2,...,xd)是通过按位交织等获得的.了解是否有比下面的幼稚ansatz更好的方法来获取最近的邻居(此处为d = 2,伪代码"):

Regarding point 2: I should add that I understand how to build the Morton-ordered array for a point cloud in R^d where the index i = f(x1, x2, ..., xd) is obtained from bitwise interlacing etc. What I try to understand is whether there is a better way than the following naive ansatz to get the nearest neighbors (here in d=2, "pseudo code"):

// Get the z-indices of cells adjacent to the cell containing (x, y) 
// Accessing the contents of the cells is irrelevant here
(x, y) \elem R^2    
point = (x, y)
zindex = f(x, y)     
(zx, zy) = f^(-1)(zindex)          // grid coordinates 
nc = [(zx - 1, zy - 1), (zx - 1, zy), (zx - 1, zy + 1),  // neighbor grid 
      (zx    , zy - 1),               (zx,     zy + 1),  // coordinates
      (zx + 1, zy - 1), (zx + 1, zy), (zx + 1, zy + 1)]

ni= [f(x[0], x[1]) for x in nc]    // neighbor indices

推荐答案

在现代的基于多级缓存的计算机系统中,空间局部性是优化对数据元素的访问时间的重要因素.

In modern multi-level cache-based computer systems, spacial locality is an important factor in optimising access-time to data elements.

简单地说,这意味着如果您访问内存中的一个数据元素,然后访问附近内存中的另一个数据元素(其地址与第一个地址接近)会比访问数据便宜几个数量级.元素很远.

Put simply, this means if you access a data element in memory, then accessing another data element in memory that is nearby (has an address that is close to the first) can be cheaper by several orders of magnitude that accessing a data element that is far away.

当按顺序访问1-d数据时(例如在简单的图像处理或声音处理中),或以相同的方式遍历处理每个元素的数据结构,则将这些数据元素按顺序排列在内存中往往会达到空间局部性-即在访问元素N之后访问元素N + 1,这两个元素应该在内存中彼此相邻放置.

When 1-d data is accessed sequentially, as in simply image processing or sound processing, or iterating over data structures processing each element the same way, then arranging the data elements in memory in order tends to achieve spatial locality - i.e. since you access element N+1 just after accessing element N, the two elements should be placed next to each other in memory.

标准c数组(和许多其他数据结构)具有此属性.

Standard c arrays (and many other data structures) have this property.

Morton排序的重点是支持以下方案:按维度 2 访问数据,而不是按维度一个访问数据.换句话说,在访问元素(x,y)之后,您可以继续访问(x + 1,y)或(x,y + 1)或类似内容.

The point of Morton ordering is to support schemes where data is accessed two dimensionally instead of one dimensionally. In other words, after accessing element (x,y), you may go on to access (x+1,y) or (x,y+1) or similar.

Morton排序意味着(x,y),(x + 1,y)和(x,y + 1)在内存中彼此靠近.在标准c多维数组中,不一定是这种情况.例如,在数组myArray [10000] [10000]中,(x,y)和(x,y + 1)相隔10000个元素-相距太远,无法利用空间局部性.

The Morton ordering means that (x,y), (x+1,y) and (x,y+1) are near to each other in memory. In a standard c multidimensional array, this is not necessarily the case. For example, in the array myArray[10000][10000], (x,y) and (x,y+1) are 10000 elements apart - too far apart to take advantage of spatial locality.

在Morton排序中,标准c数组仍可以用作数据存储,但是计算出(x,y)不再像store [x + y * rowsize]那样简单的计算

In a Morton ordering, a standard c array can still be used as a store for the data, but the calculation to work out where (x,y) is is no longer as simple as store[x+y*rowsize].

要使用Morton排序实现您的应用程序,您需要弄清楚如何将坐标(x,y)转换为商店中的地址.换句话说,您需要一个功能f(x,y),该功能可以像store[f(x,y)]一样用于访问商店.

To implement your application using Morton ordering, you need to work out how to transform a coordinate (x,y) into the address in the store. In other words, you need a function f(x,y) that can be used to access the store as in store[f(x,y)].

您似乎需要做更多研究-请访问Wikipedia页面上的链接,尤其是BIGMIN函数上的链接.

Looks like you need to do some more research - follow the links from the wikipedia page, particularly the ones on the BIGMIN function.

这篇关于使用莫顿阶进行最近邻搜索的好处?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆