用Python编写的朋友之友算法需要在Fortran 90/95中 [英] Friends-of-friends algorithm written in Python need to be in Fortran 90/95

查看:180
本文介绍了用Python编写的朋友之友算法需要在Fortran 90/95中的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试编写自己的朋友之友算法。该算法作用于一组三维数据点并返回数据集中晕圈的数量。每个晕都是距离小于链接长度的点的集合,b是程序唯一的参数。

算法描述:
FOF算法有一个称为链接长度的自由参数。任何两个相距小于或等于链接长度的粒子称为朋友。然后通过一组粒子来定义FOF组,该组中的每个粒子通过朋友网络与该组中的每个其他粒子相连。



设置FOF组计数器j = 1。


  • 粒子,n,尚未与任何组相关联:


  • 将n赋值给j组,初始化一个新的成员列表mlist,第一个条目,

  • 递归地,对于mlist中的每个新粒子p:


  • 查找p的邻居位于小于或等于链接长度的距离内,将未分配给组j的那些添加到mlist,
  • 为组j记录mlist,设置j = j + 1。
  • li>


这是我对代码算法的尝试。我在这样做的唯一语言是Python。但是,我需要使用Fortran编写此代码或使其更快。我真的希望有人能帮助我。

首先,我生成一组应该模仿3个光晕的点:

 从随机导入中导入随机
*
从数学导入导入数学
*
从numpy导入导入numpy
*
进口时间

分数= 1000

光环= [0,100。,150。]

x = []
y = []
z = []
id = []
为arange(0,points,1)中的i:
x.append(halos [0] + random() )
y.append(halos [0] + random())
z.append(halos [0] + random())
id.append(i)

for a in arange(points,points * 2,1):
x.append(halos [1] + random())
y.append(halos [1] + random() )
z.append(halos [1] + random())
id.append(i)

for a in arange(points * 2,points * 3,1 ):
x.append(halos [2] + random())
y.append(halos [2] + random())
z.append(晕动[2] +随机())
id.append(i)

然后我编码FOF算法: / p>

($)$ b $ =数组($)
t0 = time.time()

id_grp = []
groups = zeros((len(x),1))。tolist()
particles = id
b = 1#链接长度
,而len(粒子)> 0:
index = particles [0]
#从粒子列表中移除粒子
粒子.remove(index)
groups [index] = [index]
print#N,index
dx = xx [index]
dy = yy [index]
dz = zz [index]
dr = sqrt(dx ** 2. + dy ** 2. + dz ** 2。)
id_to_look = where(dr id_to_look.remove(index)
nlist = id_to_look
#从粒子列表中删除所有邻居
为n列表中的i:
if(i in粒子):
particles.remove(i)
print - > nelist
groups [index] = groups [index] + nlist
new_nlist = nlist
while len(new_nlist)> 0:
index_n = new_nlist [0]
new_nlist.remove(index_n)
print----> neigh,index_n
dx = xx [index_n]
dy = yy [index_n]
dz = zz [index_n]
dr = sqrt(dx ** 2. + dy * * 2 + dz ** 2)
id_to_look = where(dr id_to_look = list(set(id_to_look)& set(particles))
nlist = id_to_look
if(len(nlist)== 0):
print找不到新的邻居
else:
groups [index] = groups [index] + nlist
new_nlist = new_nlist + nlist
print------> neigh-neigh,new_nlist
为n列表中的k:
particles.remove(k)

最后,最后列出一个列表 groups



的代码是有点偏离主题,但我认为这将是很好的展示给你。我基本上删除所有没有粒子的组,根据粒子的数量排序并显示一些属性。

  def select(test,list):
selected = []
列表中的项目:
如果测试(item)== True:
selected.append(item)
return selected b
$ b groups = select(lambda x:sum(x)> 0.,groups)
#排序组
groups.sort(lambda x,y:cmp(len(x),len(y)))
groups.reverse()

print time.time() - t0,seconds

mass = x
for a in arange(0,len(groups),1):
total_mass = sum([ mass [j] for j in groups [i]])
x_cm = sum([mass [j] * x [j] fo rj in groups [i]])/ total_mass
y_cm = sum([mass [j] * y [j] for group in [i]])/ total_mass
z_cm = sum([mass [对于组[i]中的j]] / total_mass
dummy_x_cm = [x]中的j [x] -y_cm for j in groups [i]]
dummy_z_cm = [z [j] -z_cm for j in groups [i]]
dummy_x_cm = array(dummy_x_cm)
dummy_y_cm = array(dummy_y_cm )
dummy_z_cm = array(dummy_z_cm)
dr = max(sqrt(dummy_x_cm ** 2)+ dummy_y_cm ** 2. + dummy_z_cm ** 2。))
dummy_x_cm = max(dummy_x_cm)
dummy_y_cm = max(dummy_y_cm)
dummy_z_cm = max(dummy_z_cm)
print i,len(groups [i]),x_cm,y_cm,z_cm,dummy_x_cm,dummy_y_cm,dummy_z_cm


解决方案

我认为你不应该开始学习Fortran希望得到的代码将比您当前的实现更快。它最终可能是,但我认为最好是在考虑用另一种语言,特别是外语实现之前,尽可能快地实现Python实现。



我个人写的是Fortran,我个人认为它的表现在Python中很受欢迎,但了解这些内容的人提供了一些有说服力的论点,如果精心制作的话,Python + SciPy + Numpy可以在许多计算内核中与Fortran相匹配科学/工程计划。不要忘记,在计算机上的所有内核运行得很热之前,您还没有优化您的Python。



所以:

1st - 在Python中获得工作实现。



第二步 - 尽可能快地实现您的实现。



IF(大写字母,因为它是一个很大的'if')代码仍然不够快,将其翻译为编译语言的成本/收益是有利的,那么考虑将其转换为哪种编译语言。如果你在Fortran被广泛使用的领域,那么尽量学习Fortran,但这是一种利基语言,它可能会让你的职业更受益于学习C ++或其亲属之一。



编辑(太长以至于无法放入评论框)为什么在您的问题中误导我们?你声明你所熟悉的唯一语言是Python,现在你说你知道Fortran。我想你一定对它不舒服。并且,从您的评论看来,似乎您真正需要的帮助是让您的Python实施更快; Sideshow Bob 提供了一些建议。考虑到这一点,然后平行。

I'm trying to write my own code for a 'friends-of-friends' algorithm. This algorithm acts on a set of 3d data points and returns the number of 'halos' in the dataset. Each halo is a collection of point whose distance is less than the linking length, b, the only parameter of the program.

Algorithmic Description: The FOF algorithm has a single free parameter called the linking length. Any two particles that are separated by a distance less than or equal to the linking length are called "friends". The FOF group is then defined by the set of particles for which each particle within the set is connected to every other particle in the set through a network of friends.

Set FOF group counter j=1.

  • For each particle, n, not yet associated with any group:

  • Assign n to group j, initialize a new member list, mlist, for group j with particle n as first entry,

  • Recursively, for each new particle p in mlist:

  • Find neighbors of p lying within a distance less than or equal to the linking length, add to mlist those not already assigned to group j,
  • Record mlist for group j, set j=j+1.

This is my attempt to code the algorithm. The only language I'm comfortable in doing this is Python. However, I need this code to be written in Fortran or make it faster. I really hope someone would help me.

First I generate a set of points that should mimic the presence of 3 halos:

import random
from random import *
import math
from math import *
import numpy
from numpy import *
import time

points = 1000

halos=[0,100.,150.]

x=[]
y=[]
z=[]
id=[]
for i in arange(0,points,1):
   x.append(halos[0]+random())
   y.append(halos[0]+random())
   z.append(halos[0]+random())
   id.append(i)

for i in arange(points,points*2,1):
   x.append(halos[1]+random())
   y.append(halos[1]+random())
   z.append(halos[1]+random())
   id.append(i)

for i in arange(points*2,points*3,1):
   x.append(halos[2]+random())
   y.append(halos[2]+random())
   z.append(halos[2]+random())
   id.append(i)

Then I coded the FOF algorithm:

  x=array(x)
  y=array(y)
  z=array(z)
  id=array(id)

  t0 = time.time()                         

  id_grp=[]
  groups=zeros((len(x),1)).tolist()
  particles=id
  b=1 # linking length
  while len(particles)>0:
  index = particles[0]
  # remove the particle from the particles list
  particles.remove(index)
  groups[index]=[index]
  print "#N ", index
  dx=x-x[index]
  dy=y-y[index]
  dz=z-z[index]
  dr=sqrt(dx**2.+dy**2.+dz**2.)
  id_to_look = where(dr<b)[0].tolist()
  id_to_look.remove(index)
  nlist = id_to_look
  # remove all the neighbors from the particles list
  for i in nlist:
        if (i in particles):
           particles.remove(i)
  print "--> neighbors", nlist
  groups[index]=groups[index]+nlist
  new_nlist = nlist
  while len(new_nlist)>0:
          index_n = new_nlist[0]
          new_nlist.remove(index_n)
          print "----> neigh", index_n
          dx=x-x[index_n]
          dy=y-y[index_n]
          dz=z-z[index_n]
          dr=sqrt(dx**2.+dy**2.+dz**2.)
          id_to_look = where(dr<b)[0].tolist()
          id_to_look = list(set(id_to_look) & set(particles))
          nlist = id_to_look
          if (len(nlist)==0):
             print "No new neighbors found"
          else:
             groups[index]=groups[index]+nlist
             new_nlist=new_nlist+nlist
             print "------> neigh-neigh", new_nlist
             for k in nlist:
               particles.remove(k)

At the end one ends up with a list of the halos in the list groups

This part of the code is a bit off topic but I thought it would be nice to show it to you. I am basically deleting all the groups with no particles, sorting them according to the number of particles and showing some properties.

  def select(test,list):
  selected = []
  for item in list:
    if test(item) == True:
      selected.append(item)
  return selected

  groups=select(lambda x: sum(x)>0.,groups)
  # sorting groups
  groups.sort(lambda x,y: cmp(len(x),len(y)))
  groups.reverse()

  print time.time() - t0, "seconds"

  mass=x
  for i in arange(0,len(groups),1):
    total_mass=sum([mass[j] for j in groups[i]])
    x_cm = sum([mass[j]*x[j] for j in groups[i]])/total_mass
    y_cm = sum([mass[j]*y[j] for j in groups[i]])/total_mass
    z_cm = sum([mass[j]*z[j] for j in groups[i]])/total_mass
    dummy_x_cm = [x[j]-x_cm for j in groups[i]]
    dummy_y_cm = [y[j]-y_cm for j in groups[i]]
    dummy_z_cm = [z[j]-z_cm for j in groups[i]]
    dummy_x_cm = array(dummy_x_cm)
    dummy_y_cm = array(dummy_y_cm)
    dummy_z_cm = array(dummy_z_cm)
    dr = max(sqrt(dummy_x_cm**2.+dummy_y_cm**2.+dummy_z_cm**2.))
    dummy_x_cm = max(dummy_x_cm)
    dummy_y_cm = max(dummy_y_cm)
    dummy_z_cm = max(dummy_z_cm)
    print i, len(groups[i]), x_cm, y_cm, z_cm,dummy_x_cm,dummy_y_cm,dummy_z_cm

解决方案

I think that you would be ill-advised to start off learning Fortran in the hope that the resulting code will be faster than your current implementation. It may, eventually, be, but I think you would be better advised to make your Python implementation as fast as you can before thinking of implementing in another language, especially a foreign language.

I write Fortran, and personally I think its performance pisses all over Python, but people who know about these things provide compelling arguments that Python+SciPy+Numpy can, if carefully crafted, match Fortran for speed in the computational kernels of many scientific/engineering programs. Don't forget that you haven't optimised your Python until all the cores on your computer are running red hot.

So:

1st - get a working implementation in Python.

2nd - make your implementation as fast as possible.

IF (capital letters because it's a big 'if') the code is still not fast enough and the cost/benefit of translating it to a compiled language is favourable THEN consider which compiled language to translate it in to. If you are in a field where Fortran is widely used, then learn Fortran by all means, but it is something of a niche language and it may benefit your career more to learn C++ or one of its relatives.

EDIT (too long to fit in the comment box)

Why mislead us in your question ? You state that the only language you are comfortable with is Python, now you say that you know Fortran. I guess you must be uncomfortable with it. And, from your comment, it seems that maybe the help you really need is in making your Python implementation faster; Sideshow Bob has offered some advice. Take that into consideration, then parallelise.

这篇关于用Python编写的朋友之友算法需要在Fortran 90/95中的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆