寻找一个好的空间分区数据结构,从中快速生成数百万个原子键 [英] Looking for a good space-partitioning data structure to generate millions of atomic bonds quickly from

查看:233
本文介绍了寻找一个好的空间分区数据结构,从中快速生成数百万个原子键的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在执行涉及数百万个原子系统的一些MD模拟。



我写了一些代码来生成一个文件,它只是一个XYZ原子坐标。现在我需要在原子之间产生键。



示例XYZ文件:



pre> 1 0 0
2 0 0
7 0 0
10 0 0
9 0 0

所以我有五个原子。如果我的距离阈值是2单位,那么我的债券列表将是:

  1 2 
3 5
4 5

(其中数字对应于XYZ文件中坐标的索引) / p>

生成此列表的朴素方法是:

  for i = 1:numAtoms 
for j = i + 1:numAtoms
如果距离(atom [i],atom [j] 2
bonds.push [i,j]

并且即使在对于数百万原子的高度优化的C中也是缓慢的,至少与我将进行这个过程一样频繁。



分区数据结构是使用kd-tree,当我写一个光子映射器一次,所以我真的不知道什么是这个问题的最佳解决方案。



我还应该提到我的模拟框是周期性的,这意味着一个原子在(0.5,0 ,0)将与距离阈值(例如2)的(boxWidth-0.5,0,0)处的原子键合。

解决方案

简单的解决方案是第一个尝试。它们可以快速编写代码,并且易于测试。



因此,您可以通过分配来严格修剪搜索空间网格坐标到你的原子。没有技术。像一个穷人的八叉树...



所有你需要做的是网格大小为2,并搜索局部网格和每个相邻网格中的所有原子。显然,网格是3D的。



显然,你做一个初步的传递,并创建一个列表(或向量)属于每个单元格的原子。您可以将列表存储在由3D网格坐标索引的映射中。然后对于每个原子,你可以查找本地列表,并进行债券测试。



此外,不要使用平方根作为距离。用距离平方来操作。这将节省处理的桶负载。


I'm performing some MD simulations involving systems of millions of atoms.

I have written some code to generate a file which is just a listing of XYZ atom coordinates. Now I need to generate bonds between the atoms. If two atoms are within a certain distance of each other, that is considered a bond.

Example XYZ file:

1 0 0
2 0 0
7 0 0
10 0 0
9 0 0

So I've got five atoms. If my distance threshold is 2 units, then my bond listing will be:

1 2
3 5
4 5

(where the numbers correspond to the index of the coordinate in the XYZ file).

The naive approach to generate this list is just:

for i = 1:numAtoms
    for j = i+1:numAtoms
        if distance(atom[i], atom[j]) < 2
            bonds.push [i, j]

However, this quickly hits an algorithmic limit and is slow even in highly optimized C for millions of atoms, at least for as frequently as I'll be doing this process.

The only experience I have with space-partitioning data structures is with kd-trees when I wrote a photon mapper once, so I don't really know what the best solution to this problem is. I'm sure there's probably something out there that's optimal for this though.

I should also mention that my simulation box is periodic, meaning that an atom at (0.5, 0, 0) will bond to an atom at (boxWidth - 0.5, 0, 0) with a distance threshold such as 2.

解决方案

Simple solutions are the first to try. They are quick to code, and easy to test. If that does not give you the required performance, then you can try something trickier.

So, you can seriously prune the search space by assigning grid co-ordinates to your atoms. Nothing technical. Like a poor-man's octree...

All you need to do is have a grid size of 2, and search all atoms within the local grid and each neighbouring grid. Obviously the grid is 3D. An atom's grid location is nothing more complex than dividing each its co-ordinates by the grid size.

Obviously you do a preliminary pass and create a list (or vector) of atoms belonging to each cell. You can store the lists in a map indexed by the 3D grid co-ordinate. Then for each atom, you can just lookup the local lists and do the bond tests.

Also, don't use square-root for your distance. Operate with distance squared instead. This will save a bucket-load of processing.

这篇关于寻找一个好的空间分区数据结构,从中快速生成数百万个原子键的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆