Server-side clustering for Google Maps API v3


Question

I am currently developing a kind of Google Maps overview widget that displays locations as markers on the map. The number of markers varies from several hundred up to thousands (10,000 and up). Right now I am using MarkerClusterer for Google Maps v3 1.0 and the Google Maps JavaScript API v3 (premier), and it works pretty decently for, let's say, a hundred markers. Because the number of markers will increase, I need a new way of clustering them. From what I have read, the only way to keep performance up is to move the clustering from the client side to the server side. Does anyone know a good PHP5 library that can do this for me?

At the moment I am digging deeper into the layer mechanisms of Google Maps. Maybe there are also a few leading PHP libraries I could start checking out? I also ran across FusionTables, but since I need clustering I think it might not be the right solution.

Thanks in advance!

Answer

I don't know of a server-side library that'll do the job for you. I can however give you some pointers on how to implement one yourself.

The basic approach to clustering is simply to calculate the distance between your markers and when two of them are close enough you replace them with a single marker located at the mid-point between the two.

Instead of just having a limitation on how close to each other markers may be, you may also (or instead) choose to limit the number of clusters/markers you want as a result.

To accomplish this you could calculate the distance between all pairs of markers, sort them, and then merge from the top until you only have as many markers/clusters as you wish.

To refine the mid-point positioning when forming a cluster you may take into account the number of actual markers represented by each of the two to be merged. Think of that number as a weight and the line between the two markers as a scale. Then instead of always choosing the mid-point, choose the point that would balance the scale.
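The merge-until-few-enough idea with weighted positioning could be sketched roughly as follows. This is only an illustration, not a library API: the names `merge` and `cluster_markers` are made up here, clusters are plain `(lat, lng, weight)` tuples, and distance is measured as straight-line distance in degrees, which is an assumption that only holds up for small areas away from the poles. Instead of sorting all pairs once, the sketch looks for the closest pair again after every merge, because merging moves a cluster.

```python
import math
from itertools import combinations


def merge(a, b):
    """Merge two (lat, lng, weight) clusters at the point that balances their weights."""
    weight = a[2] + b[2]
    lat = (a[0] * a[2] + b[0] * b[2]) / weight
    lng = (a[1] * a[2] + b[1] * b[2]) / weight
    return (lat, lng, weight)


def cluster_markers(markers, max_clusters):
    """Repeatedly merge the closest pair until at most max_clusters remain.

    markers is a list of (lat, lng) pairs; every marker starts with weight 1.
    """
    clusters = [(lat, lng, 1) for lat, lng in markers]
    limit = max(max_clusters, 1)  # a single cluster is the smallest possible result

    def distance(pair):
        a, b = clusters[pair[0]], clusters[pair[1]]
        return math.hypot(a[0] - b[0], a[1] - b[1])  # planar approximation

    while len(clusters) > limit:
        # Every pair is inspected on each round, which is why this does not scale.
        i, j = min(combinations(range(len(clusters)), 2), key=distance)
        merged = merge(clusters[i], clusters[j])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters
```

For example, `cluster_markers([(0, 0), (0, 1), (10, 10)], 2)` merges the two nearby markers into one cluster of weight 2 and leaves the distant marker alone.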

I'd guess that this simple form of clustering is good enough if you have a limited number of markers. If your data set (the number of markers and their positions) is roughly static, you can calculate the clustering on the server once in a while, cache it, and serve clients directly from the cache.

However, if you need to support large scale scenarios potentially with markers all over the world you'll need a more sophisticated approach.

The clustering algorithm described above does not scale. Because it compares every pair of markers, its computation cost grows at least quadratically with the number of markers.

To remedy this you could split the world into partitions and calculate clustering and serve clients from each partition. This would indeed support scaling since the workload can be split and performed by several (roughly) independent servers.

The question then is how to find a good partitioning scheme. You may also want to consider providing different clustering of markers at different zoom levels, and your partitioning scheme should incorporate this as well to allow scaling.

Google divides the map into tiles with x, y and z coordinates, where x and y are the horizontal and vertical position of the tile starting from the north-west corner of the map, and where z is the zoom level.

At the minimum zoom level (zero) the entire map consists of a single tile. (All tiles are 256x256 pixels.) At the next zoom level that tile is divided into four sub tiles. This continues, so that at zoom level 2 each of those four tiles has been divided into four sub tiles, which gives us a total of 16 tiles. Zoom level 3 has 64 tiles, level 4 has 256 tiles, and so on. (The number of tiles at any zoom level can be expressed as 4^z.)
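To place a marker in this scheme you need its tile coordinates. A minimal sketch, assuming the standard Web Mercator projection that Google Maps uses; the function names are hypothetical and the clamping of out-of-range latitudes is my own choice:

```python
import math


def tile_count(z):
    """Number of tiles at zoom level z: each level quadruples the previous one."""
    return 4 ** z


def lat_lng_to_tile(lat, lng, z):
    """Return the (x, y) tile containing a point, counted from the north-west corner."""
    n = 2 ** z  # tiles per axis at this zoom level
    x = int((lng + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    # Web Mercator: y runs from 0 at about +85.05 degrees to n at about -85.05 degrees.
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so that lng == 180 or latitudes beyond the projection still map to a tile.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y
```

At zoom level 1 the point (0, 0) on the equator falls in the south-east tile, so `lat_lng_to_tile(0, 0, 1)` gives `(1, 1)`.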

Using this partitioning scheme you could calculate clustering per tile, starting at the most zoomed-in level (the highest z-coordinate) and bubbling up until you reach the top.

The set of markers to be clustered for a single tile is the union of all markers (some of which may represent clusters) of its four sub tiles.
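The bubbling-up step might look roughly like this. It is a sketch under assumptions: tiles are kept in a dict keyed by `(x, y)`, the deepest level is taken as already clustered, and `cluster_fn` stands in for whatever per-tile clustering routine you use (all of these names are hypothetical). The parent of tile `(x, y)` at zoom `z` is `(x // 2, y // 2)` at zoom `z - 1`.

```python
from collections import defaultdict


def bubble_up(tiles_at_z, z, cluster_fn):
    """Build every zoom level from z down to 0.

    tiles_at_z maps (x, y) -> list of markers/clusters at zoom level z.
    cluster_fn takes such a list and returns the clustered list for one tile.
    Returns a dict mapping zoom level -> {(x, y): clustered items}.
    """
    levels = {z: tiles_at_z}
    current = tiles_at_z
    for level in range(z - 1, -1, -1):
        # A parent tile's input is the union of the items of its four sub tiles.
        union = defaultdict(list)
        for (x, y), items in current.items():
            union[(x // 2, y // 2)].extend(items)
        current = {tile: cluster_fn(items) for tile, items in union.items()}
        levels[level] = current
    return levels
```

Each tile is clustered from at most four already-clustered sub tiles, which is what keeps the cost per tile bounded.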

This gives you a limited computational cost and also gives you a nice way of chunking up the data to be sent to the client. Instead of requesting all markers for a given zoom level (which would not scale) clients can request markers on a tile-by-tile basis as they are loaded into the map.

There is however a flaw in this approach: Consider two adjacent tiles, one to the left and one to the right. If the left tile contains a marker/cluster at its far right side and the right tile contains a marker/cluster at its far left side, then those two markers/clusters should be merged but won't be since we're performing the clustering mechanism for each tile individually.

To remedy this you could post-process tiles after they have been clustered, merging markers/clusters that lie along each of the four edges and taking into account each of the eight tiles adjacent to a given tile. This post-merging mechanism will only work if we can assume that no single cluster is large enough to affect surrounding markers that are not in the same sub tile. This is, however, a reasonable assumption.
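Finding the adjacent tiles for that post-processing pass is straightforward. A small sketch with a hypothetical helper; wrapping x around the antimeridian and dropping rows beyond the top and bottom of the map are assumptions on my part:

```python
def adjacent_tiles(x, y, z):
    """Return the set of up to eight (x, y) tiles surrounding a tile at zoom z."""
    n = 2 ** z  # tiles per axis at this zoom level
    neighbours = set()
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny = y + dy
            if not 0 <= ny < n:
                continue  # no tiles above the top row or below the bottom row
            nx = (x + dx) % n  # the map wraps around horizontally
            if (nx, ny) != (x, y):
                neighbours.add((nx, ny))
    return neighbours
```

An interior tile gets eight neighbours; tiles on the top or bottom row get fewer, and at very low zoom levels the wrap-around collapses duplicates.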

As a final note: with the scaled-out approach you'll have clients making several small requests. These requests will have locality (i.e. tiles are not requested at random; instead, tiles that are geographically close to each other are typically accessed together).

To improve lookup/query performance you would benefit from using search keys (representing the tiles) that also have this locality property (since this would store data for adjacent tiles in adjacent data blocks on disk - improving read time and cache utilization).

You can form such a key using the tile/sub tile partitioning scheme. Let the top tile (the single one spanning the entire map) have the empty string as key. Next, let each of its sub tiles have the keys A, B, C and D. The next level would have keys AA, AB, AC, AD, BA, BB, ..., DC, DD.

Apply this recursively and you'll end up with a partitioning key that identifies your tiles, allows quick transformation to x,y,z-coordinates and has the locality property. This key naming scheme is sometimes called a Quad Key stemming from the fact that the partitioning scheme forms a Quad Tree. The locality property is the same as you get when using a Z-order curve to map a 2D-value into a 1D-value.
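Converting between such a key and x, y, z coordinates could be sketched like this. Which letter goes with which quadrant is not specified above, so the mapping used here (A = north-west, B = north-east, C = south-west, D = south-east) is an assumption, as are the function names:

```python
LETTERS = "ABCD"  # assumed order: NW, NE, SW, SE


def tile_to_key(x, y, z):
    """Build the key for tile (x, y) at zoom z, one letter per zoom level."""
    key = []
    for level in range(z, 0, -1):
        bit = 1 << (level - 1)
        # Bit 0 of the digit is east/west, bit 1 is north/south.
        digit = (1 if x & bit else 0) + (2 if y & bit else 0)
        key.append(LETTERS[digit])
    return "".join(key)


def key_to_tile(key):
    """Inverse of tile_to_key: return (x, y, z) for a key."""
    x = y = 0
    for letter in key:
        digit = LETTERS.index(letter)
        x = (x << 1) | (digit & 1)
        y = (y << 1) | (digit >> 1)
    return x, y, len(key)
```

A tile's key always starts with its parent's key, so a prefix or range query on the key returns a tile together with everything beneath it.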

Please let me know if you need more details.
