Python:加快地理比较 [英] Python: speeding up geographic comparison

查看：88 发布时间：2020/5/21 20:28:43 python optimization distance geography

本文介绍了Python:加快地理比较的处理方法，对大家解决问题具有一定的参考价值，需要的朋友们下面随着小编来一起学习吧！

问题描述

我写了一些代码，其中包括一个嵌套循环，其中内部循环执行了约150万次.我在此循环中有一个函数正在尝试优化.我已经做了一些工作，并得到了一些结果，但是我需要一点输入来检查我在做的事情是否明智.

I've written some code that includes a nested loop where the inner loop is executed about 1.5 million times. I have a function in this loop that I'm trying to optimize. I've done some work, and got some results, but I need a little input to check if what I'm doing is sensible.

一些背景:

我有两个地理点(纬度，经度)集合，一个相对较小的集合，一个相对较大的集合.对于小型集合中的每个点，我需要找到大型集合中的最近点.

I have two collections of geographic points (latitude, longitude), one relatively small collection and one relatively huge collection. For every point in the small collection, I need to find the closest point in the large collection.

执行此操作的明显方法是使用Haversine公式.这样做的好处是距离绝对准确.

The obvious way to do this would be to use the haversine formula. The benefit here is that the distances are definitely accurate.

from math import radians, sin, cos, asin, sqrt

def haversine(point1, point2):
    """Gives the distance between two points on earth.
    """
    earth_radius_miles = 3956
    lat1, lon1 = (radians(coord) for coord in point1)
    lat2, lon2 = (radians(coord) for coord in point2)
    dlat, dlon = (lat2 - lat1, lon2 - lon1)
    a = sin(dlat/2.0)**2 + cos(lat1) * cos(lat2) * sin(dlon/2.0)**2
    great_circle_distance = 2 * asin(min(1,sqrt(a)))
    d = earth_radius_miles * great_circle_distance
    return d

但是，在我的机器上运行150万次大约需要9秒钟(根据timeit).由于准确的距离并不重要，而我只需要找到最接近的点，我决定尝试其他一些功能.

However, running this 1.5 million times takes about 9 seconds on my machine (according to timeit). Since having an accurate distance is unimportant, rather I only need to find the closest point, I decided to try some other functions.

毕达哥拉斯定理的简单实现使我的速度提高了约30％.考虑到我可以做得更好，我写了以下内容:

A simple implementation of the pythagorean theorem gives me a speedup of about 30%. Thinking that I can do better, I wrote the following:

def dumb(point1, point2):
    lat1, lon1 = point1
    lat2, lon2 = point2
    d = abs((lat2 - lat1) + (lon2 - lon1))

这使我提高了10倍.但是，现在我担心这将无法保留三角形不等式.

which gives me a factor of 10 improvement. However, now I'm worried that this will not preserve the triangle inequality.

所以，我的最后一个问题是两个方面:我想拥有一个运行速度与dumb一样快但仍然正确的函数. dumb可以工作吗?如果没有，关于如何改善我的haversine功能有什么建议吗?

So, my final question is two fold: I'd like to have a function that runs as fast as dumb but still be correct. Will dumb work? If not, any suggestions on how to improve my haversine function?

Python:加快地理比较 [英] Python: speeding up geographic comparison

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录关闭

Python:加快地理比较 [英] Python: speeding up geographic comparison

问题描述

推荐答案

相关文章

Python最新文章

热门教程

热门工具

登录 关闭

登录关闭