How to best design a date/geographic proximity query on GAE?


Problem description



I'm building a directory for finding athletic tournaments on GAE with web2py and a Flex front end. The user selects a location, a radius, and a maximum date from a set of choices. I have a basic version of this query implemented, but it's inefficient and slow. One way I know I can improve it is by condensing the many individual queries I'm using to assemble the objects into bulk queries. I just learned that was possible. But I'm also thinking about a more extensive redesign that utilizes memcache.

The main problem is that I can't query the datastore by location, because GAE only allows inequality filters (<, <=, >=, >) on a single property per query. I'm already using one on the date, and I'd need two more properties to check both latitude and longitude, so it's a no go. Currently, my algorithm looks like this:

1.) Query by date and select

2.) Use destination function from geopy's distance module to find the max and min latitude and longitudes for supplied distance

3.) Loop through results and remove all with lat/lng outside max/min

4.) Loop through again and use distance function to check exact distance, because step 2 will include some areas outside the radius. Remove results outside supplied distance (is this 2/3/4 combination inefficient?)

5.) Assemble many-to-many lists and attach to objects (this is where I need to switch to bulk operations)

6.) Return to client
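Steps 2 to 4 above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: it uses a hand-written haversine formula and bounding box instead of geopy's functions, the dict keys `lat` and `lng` and the helper names are hypothetical, and the box calculation ignores the poles and the 180th meridian. It also folds steps 3 and 4 into a single loop, so the exact distance is only computed for points that survive the cheap box test.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)


def bounding_box(lat, lng, radius_km):
    """Return (min_lat, max_lat, min_lng, max_lng) enclosing the search circle.

    Assumption: the circle does not touch a pole or cross the 180th meridian.
    """
    ang = radius_km / EARTH_RADIUS_KM  # angular radius in radians
    dlat = math.degrees(ang)
    dlng = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lng - dlng, lng + dlng


def haversine_km(lat1, lng1, lat2, lng2):
    """Great-circle distance in kilometres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_radius(events, lat, lng, radius_km):
    """Keep events inside the circle: cheap box test first, exact test second."""
    min_lat, max_lat, min_lng, max_lng = bounding_box(lat, lng, radius_km)
    hits = []
    for ev in events:
        if not (min_lat <= ev["lat"] <= max_lat
                and min_lng <= ev["lng"] <= max_lng):
            continue  # outside the box, skip the trigonometry
        if haversine_km(lat, lng, ev["lat"], ev["lng"]) <= radius_km:
            hits.append(ev)
    return hits
```

Whether a single pass is noticeably faster than two loops depends on how many rows the date query returns, so it is worth measuring before restructuring anything.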

Here's my plan for using memcache. Let me know if I'm way out in left field on this, as I have no prior experience with memcache or server caching in general.

-Keep a list in the cache filled with "geo objects" that represent all my data. These have five properties: latitude, longitude, event_id, event_type (in anticipation of expanding beyond tournaments), and start_date. This list will be sorted by date.

-Also keep a dict of pointers in the cache which represent the start and end indices in the cache for all the date ranges my app uses (next week, 2 weeks, month, 3 months, 6 months, year, 2 years).

-Have a scheduled task that updates the pointers daily at 12am.

-Add new inserts to the cache as well as the datastore; update pointers.

Using this design, the algorithm would now look like:

1.) Use pointers to slice off appropriate chunk of list based on supplied date.

2-4.) Same as above algorithm, except with geo objects

5.) Use bulk operation to select full tournaments using remaining geo objects' event_ids

6.) Assemble many-to-manys

7.) Return to client
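Step 1 of the redesigned algorithm, slicing a date-sorted list, can be sketched with the standard `bisect` module. This is an assumption-laden illustration rather than the planned implementation: the function name and the parallel `start_dates` list are hypothetical, and the indices are computed at query time instead of being read from a dict of precomputed pointers.

```python
import bisect
from datetime import date, timedelta


def slice_by_date(geo_objects, start_dates, today, max_date):
    """Return the geo objects whose start_date lies in [today, max_date].

    geo_objects must already be sorted by start_date, and start_dates must be
    the parallel list of those dates (same order, same length).
    """
    lo = bisect.bisect_left(start_dates, today)      # first date >= today
    hi = bisect.bisect_right(start_dates, max_date)  # one past last date <= max_date
    return geo_objects[lo:hi]


# Usage with made-up data: six events, ten days apart.
objs = [{"event_id": i, "start_date": date(2024, 1, 1) + timedelta(days=10 * i)}
        for i in range(6)]
dates = [o["start_date"] for o in objs]
chunk = slice_by_date(objs, dates, date(2024, 1, 5), date(2024, 1, 31))
```

Because each lookup is a binary search, computing the indices per request is cheap, which may make the daily pointer-update task unnecessary.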

Thoughts on this approach? Many thanks for reading and any advice you can give.

-Dane

Solution

You might be interested in geohash, which enables you to do an inequality query like this:

SELECT latitude, longitude, title FROM myMarkers WHERE geohash >= :sw_geohash AND geohash <= :ne_geohash
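To make the query above concrete, here is a minimal sketch of the standard geohash encoding plus the same range test done in memory. The function names and the tuple layout are hypothetical, and in a real app the hash would be computed once and stored as a property on each entity. Note that a range between the south-west and north-east hashes contains every point in the box but also some points outside it, so the exact distance check is still needed afterwards. Boxes that cross the 180th meridian are not handled here.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet, in ASCII order


def geohash_encode(lat, lng, precision=9):
    """Encode a latitude/longitude pair as a geohash string."""
    lat_lo, lat_hi = -90.0, 90.0
    lng_lo, lng_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lng = True  # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lng:
            mid = (lng_lo + lng_hi) / 2
            if lng >= mid:
                bits = (bits << 1) | 1
                lng_lo = mid
            else:
                bits <<= 1
                lng_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits <<= 1
                lat_hi = mid
        use_lng = not use_lng
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)


def in_geohash_range(points, sw, ne, precision=9):
    """Keep (lat, lng) points whose hash lies between the corner hashes.

    All hashes use the same precision so plain string comparison is valid.
    """
    lo = geohash_encode(sw[0], sw[1], precision)
    hi = geohash_encode(ne[0], ne[1], precision)
    return [p for p in points
            if lo <= geohash_encode(p[0], p[1], precision) <= hi]
```

Since the range filter uses a single property, it fits within the one-inequality-property limit, but only if the date is then handled some other way, for example by filtering in memory.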

Have a look at this fine article, which was featured in this month's Google App Engine Community Update blog post.

As a note on your proposed design, don't forget that entities in Memcache have no guarantee of staying in memory, and that you cannot have them "sorted by date".
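One way to live with both constraints is to treat the cache as optional: sort the list in application code, store it as a single value, and rebuild it from the datastore whenever it has been evicted. The sketch below is only an illustration. `FakeCache` is a stand-in written for this example (it is not the App Engine memcache API), and `load_from_datastore` is a hypothetical callable that returns the geo objects.

```python
class FakeCache(object):
    """In-memory stand-in for a cache whose entries may vanish at any time."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)  # None signals a miss

    def set(self, key, value):
        self._data[key] = value

    def evict(self, key):
        self._data.pop(key, None)  # simulates the cache dropping an entry


def get_geo_objects(cache, load_from_datastore, key="geo_objects"):
    """Return the date-sorted geo objects, rebuilding them on a cache miss."""
    objs = cache.get(key)
    if objs is None:
        # The cache cannot sort anything, so sort here and store the whole
        # list as one value under one key.
        objs = sorted(load_from_datastore(), key=lambda o: o["start_date"])
        cache.set(key, objs)
    return objs
```

With this shape, the datastore stays the source of truth and a lost cache entry only costs one slower request.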
