Clustering Lat/Longs in a Database

Question

I'm trying to see if anyone knows how to cluster some Lat/Long results, using a database, to reduce the number of results sent over the wire to the application.

There are a number of resources about how to cluster, either on the client side or on the server (application) side, but not on the database side :(

This is a similar question, asked by a fellow S.O. member. The solutions are server-side (i.e. C# code-behind).

Has anyone had any luck or experience with solving this, but in a database? Are there any database gurus out there who are after a hawt and sexy DB challenge?

Please help :)

EDIT 1: Clarification - by clustering, I'm hoping to group x number of points into a single point for an area. So, if I say cluster everything in a 1 mile / 1 km square, then all the results in that 'square' are GROUP'D into a single result (say ... the middle of the square).

EDIT 2: I'm using MS SQL Server 2008, but I'm open to hearing about solutions in other DBs.
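To make the grouping in EDIT 1 concrete, here is a minimal Python sketch of the idea, not a database solution. It assumes a roughly spherical Earth (about 111.32 km per degree of latitude) and breaks down near the poles; the names `grid_cluster`, `cell_of` and `km_per_deg_lon` are made up for illustration.

```python
import math
from collections import Counter

KM_PER_DEG_LAT = 111.32  # assumed average; the true value varies slightly with latitude

def km_per_deg_lon(row, cell_km):
    """Kilometres per degree of longitude at the centre latitude of a grid row."""
    center_lat = (row + 0.5) * cell_km / KM_PER_DEG_LAT
    return KM_PER_DEG_LAT * math.cos(math.radians(center_lat))

def cell_of(lat, lon, cell_km):
    """Return the (row, col) index of the grid cell containing a point."""
    row = math.floor(lat * KM_PER_DEG_LAT / cell_km)
    col = math.floor(lon * km_per_deg_lon(row, cell_km) / cell_km)
    return row, col

def grid_cluster(points, cell_km=1.0):
    """Collapse (lat, lon) points into one (centre_lat, centre_lon, count) per occupied cell."""
    counts = Counter(cell_of(lat, lon, cell_km) for lat, lon in points)
    results = []
    for (row, col), count in sorted(counts.items()):
        center_lat = (row + 0.5) * cell_km / KM_PER_DEG_LAT
        center_lon = (col + 0.5) * cell_km / km_per_deg_lon(row, cell_km)
        results.append((center_lat, center_lon, count))
    return results
```

In a database, the same bucketing would presumably become a GROUP BY on the two floored values, which is the part I'm hoping someone can confirm.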

Answer

I'd probably use a modified* version of k-means clustering on the Cartesian (e.g. WGS-84 ECEF) coordinates of your points. It's easy to implement, usually converges quickly, and adapts to however your data happens to be distributed. Plus, you can pick k to suit your bandwidth requirements: the application receives at most k cluster points, however many raw points each one represents.
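For reference, converting latitude/longitude/height to WGS-84 Earth-centered, Earth-fixed Cartesian coordinates looks roughly like this. It is a sketch using the standard ellipsoid constants; the function name `llh_to_ecef` is just illustrative.

```python
import math

WGS84_A = 6378137.0                 # semi-major axis in metres
WGS84_F = 1 / 298.257223563         # flattening
WGS84_E2 = WGS84_F * (2 - WGS84_F)  # first eccentricity squared

def llh_to_ecef(lat_deg, lon_deg, h=0.0):
    """Convert geodetic latitude/longitude (degrees) and height (metres) to ECEF x, y, z in metres."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude
    n = WGS84_A / math.sqrt(1 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1 - WGS84_E2) + h) * math.sin(lat)
    return x, y, z
```

For clustering purposes a plain unit sphere is probably good enough, but the ellipsoid version costs almost nothing extra.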

I'd make a table of cluster centroids, and add a field to the original data table to indicate which cluster each point belongs to. You'd obviously want to update the clustering periodically if your data is at all dynamic. I don't know whether you could do that with a stored procedure and a trigger, but perhaps.
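The table layout could look something like the sketch below. It uses Python's built-in `sqlite3` as a stand-in for MS SQL 2008 (so the DDL would need adapting), and the table and column names (`points`, `cluster_centroids`, `cluster_id`) as well as the sample rows are invented for the example.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with a centroid table and a cluster_id column on the data table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE cluster_centroids (
            cluster_id INTEGER PRIMARY KEY,
            lat REAL NOT NULL,
            lon REAL NOT NULL
        );
        CREATE TABLE points (
            id INTEGER PRIMARY KEY,
            lat REAL NOT NULL,
            lon REAL NOT NULL,
            cluster_id INTEGER REFERENCES cluster_centroids(cluster_id)
        );
    """)
    # Illustrative data only: two centroids and three points already assigned to them
    conn.executemany("INSERT INTO cluster_centroids VALUES (?, ?, ?)",
                     [(1, 51.51, -0.12), (2, 40.71, -74.00)])
    conn.executemany("INSERT INTO points VALUES (?, ?, ?, ?)",
                     [(1, 51.50, -0.12, 1), (2, 51.52, -0.13, 1), (3, 40.70, -74.01, 2)])
    return conn

def clustered_results(conn):
    """Return one row per cluster: id, centroid position, and number of member points."""
    return conn.execute("""
        SELECT c.cluster_id, c.lat, c.lon, COUNT(p.id) AS point_count
        FROM cluster_centroids AS c
        JOIN points AS p ON p.cluster_id = c.cluster_id
        GROUP BY c.cluster_id, c.lat, c.lon
        ORDER BY c.cluster_id
    """).fetchall()
```

The application then only ever pulls the small `clustered_results` set over the wire; the periodic re-clustering job is what rewrites `cluster_id` and the centroid rows.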

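*The "modification" would be to rescale each computed centroid vector so that it lies on the surface of the earth. A centroid is an average of surface points, so it falls somewhere inside the earth; without the rescaling you'd end up with a bunch of points with negative altitude (when converted back to LLH, i.e. latitude/longitude/height).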
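A rough Python sketch of that modified k-means follows. To keep it short it works on a unit sphere instead of the WGS-84 ellipsoid (an assumption on my part), so "on the surface" simply means renormalizing each centroid to length 1. The function names, the fixed iteration count and the random initialization are all illustrative choices.

```python
import math
import random

def to_unit_vector(lat_deg, lon_deg):
    """Map latitude/longitude in degrees to a Cartesian point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon), math.cos(lat) * math.sin(lon), math.sin(lat))

def to_lat_lon(v):
    """Map a Cartesian vector back to latitude/longitude in degrees."""
    x, y, z = v
    return math.degrees(math.atan2(z, math.hypot(x, y))), math.degrees(math.atan2(y, x))

def spherical_kmeans(points, k, iterations=20, seed=0):
    """Cluster (lat, lon) points; returns (centroids as (lat, lon), label per point)."""
    rng = random.Random(seed)
    vectors = [to_unit_vector(lat, lon) for lat, lon in points]
    centroids = rng.sample(vectors, k)  # start from k distinct input points
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: each point goes to the nearest centroid (squared Euclidean distance)
        labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(v, centroids[j])))
            for v in vectors
        ]
        # Update step: average the members, then push the centroid back onto the surface
        for j in range(k):
            members = [v for v, label in zip(vectors, labels) if label == j]
            if not members:
                continue  # empty cluster: keep the previous centroid
            mean = tuple(sum(c) / len(members) for c in zip(*members))
            length = math.sqrt(sum(c * c for c in mean))
            if length > 1e-12:  # skip the degenerate case of members cancelling out
                centroids[j] = tuple(c / length for c in mean)
    return [to_lat_lon(c) for c in centroids], labels
```

The returned labels are what you'd write into the cluster field on the data table, and the centroids are what you'd store in the centroid table.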
