Clustering 500,000 geospatial points in python
Question
I'm currently faced with the problem of finding a way to cluster around 500,000 latitude/longitude pairs in python. So far I've tried computing a distance matrix with numpy (to pass into scikit-learn's DBSCAN), but with such a large input it quickly raises a MemoryError.
The points are stored in tuples containing the latitude, longitude, and the data value at that point.
In short, what is the most efficient way to spatially cluster a large number of latitude/longitude pairs in python? For this application, I'm willing to sacrifice some accuracy in the name of speed.
Edit:
The number of clusters for the algorithm to find is unknown ahead of time.
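Since the number of clusters is unknown ahead of time, a density-based method is a natural fit. Below is a minimal sketch (not from the original answer; the coordinates are randomly generated stand-ins) using scikit-learn's DBSCAN with the haversine metric and a ball-tree index, which queries neighbors on the fly instead of materializing the full pairwise distance matrix:

```python
import numpy as np
from sklearn.cluster import DBSCAN

# Hypothetical input: an (n, 2) array of (latitude, longitude) in degrees.
rng = np.random.default_rng(0)
coords = rng.uniform([40.0, -75.0], [41.0, -74.0], size=(1000, 2))

kms_per_radian = 6371.0088  # mean Earth radius in km
eps_km = 5.0                # points within 5 km of each other can cluster

db = DBSCAN(
    eps=eps_km / kms_per_radian,  # haversine expects radians
    min_samples=10,
    algorithm="ball_tree",
    metric="haversine",
).fit(np.radians(coords))

labels = db.labels_  # -1 marks noise; the cluster count is discovered, not preset
```

DBSCAN never needs the full N x N matrix this way, though for 500k points the eps/min_samples choice dominates both runtime and result quality.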
Answer
I don't have your data, so I just generated 500k random numbers in three columns.
import numpy as np
import matplotlib.pyplot as plt
from scipy.cluster.vq import kmeans2, whiten

arr = np.random.randn(500000, 3)
centroids, labels = kmeans2(whiten(arr), 7, iter=20)  # <--- I randomly picked 7 clusters
plt.scatter(arr[:, 0], arr[:, 1], c=labels, alpha=0.33333)
I timed this and it took 1.96 seconds to run kmeans2, so I don't think the size of your data is the problem. Put your data in a 500,000 x 3 numpy array and try kmeans2.
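Concretely, a sketch of going from the question's (latitude, longitude, value) tuples to a kmeans2 call (the `points` list here is a made-up stand-in for the real data):

```python
import numpy as np
from scipy.cluster.vq import kmeans2, whiten

# Hypothetical stand-in for the question's list of (lat, lon, value) tuples.
points = [(40.71, -74.00, 1.2), (40.73, -73.99, 0.9),
          (34.05, -118.24, 2.1), (34.02, -118.29, 1.7)]

arr = np.asarray(points)    # shape (n, 3)
features = whiten(arr)      # scale each column to unit standard deviation

# minit="points" seeds centroids from actual observations; seed makes it repeatable
centroids, labels = kmeans2(features, 2, iter=20, minit="points", seed=1)
```

Note that whitening rescales each column independently, so latitude, longitude, and the data value all contribute comparably to the Euclidean distance kmeans uses; whether that weighting is appropriate depends on what the data value means.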