MYSQL Geo Search having distance performance

Problem description

I have a MySQL SELECT statement for a search on my website which has performance problems when the site gets really busy. The query below searches for adverts in a table with over 100k records, within 25 miles of the given lat and lon, and sorts by distance. The number of miles can differ as it is selected by the user.

The problem is that I think it is slow because it does the calculations for all records in the table rather than only those within 25 miles of the lat and lon. Is it possible to amend this query so that the WHERE clause selects only adverts within 25 miles? I've read about bounding boxes and spatial indexes but I'm not sure how to apply them to this query. Do I need to add a WHERE clause that selects records within a 25 mile radius of the lat and lon, and how do I do this?

SELECT 
    adverts.*, 
    round(sqrt((((adverts.latitude - '53.410778') * (adverts.latitude - '53.410778')) * 69.1 * 69.1) + ((adverts.longitude - '-2.97784') * (adverts.longitude - '-2.97784') * 53 * 53)), 1) as distance
FROM 
    adverts
WHERE 
    (adverts.type_id = '3')
HAVING 
    DISTANCE < 25
ORDER BY 
    distance ASC 
LIMIT 120,10

Updated to include the table schema. Please note that the table is more complicated and so is the query, but I have removed the things which aren't necessary for this problem.

CREATE TABLE `adverts` (
`advert_id` int(10) NOT NULL AUTO_INCREMENT,
`type_id` tinyint(1) NOT NULL,
`headline` varchar(50) NOT NULL,
`description` text NOT NULL,
`price` int(4) NOT NULL,
`postcode` varchar(7) NOT NULL,
`latitude` float NOT NULL,
`longitude` float NOT NULL,
PRIMARY KEY (`advert_id`),
KEY `latlon` (`latitude`,`longitude`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8

When I do an EXPLAIN on the MySQL statement, the number of rows is 67900, which is a lot more than there are in a 25 mile radius, and Extra is set to "Using where; Using filesort".

The query takes 0.3 seconds, which is really slow, especially when the website gets lots of requests per second.

Recommended answer

The fastest way to do this is to use the geospatial extensions for MySQL, which should be easy enough as you are already using a MyISAM table (in MySQL 5.6, SPATIAL indexes are only available for MyISAM tables). The documentation for these extensions can be found here: http://dev.mysql.com/doc/refman/5.6/en/spatial-extensions.html

Add a new column with a POINT datatype:

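-- If building the index fails because existing rows have no valid geometry yet,
-- add the column first, run the UPDATE below, then add the SPATIAL KEY separately.
ALTER TABLE `adverts`
ADD COLUMN `geopoint` POINT NOT NULL AFTER `longitude`,
ADD SPATIAL KEY `geopoint` (`geopoint`);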

You can then populate this column from your existing latitude and longitude fields:

UPDATE `adverts` 
SET `geopoint` = GeomFromText(CONCAT('POINT(',`latitude`,' ',`longitude`,')'));

The next step is to create a bounding box based on the input latitude and longitude, which will be used in your WHERE clause as a CONTAINS constraint. You will need to determine a set of X,Y POINT coordinates that work for your requirements based on the desired search area and given starting point. Because the column was populated as POINT(latitude longitude), X is the latitude and Y is the longitude, and the polygon must use the same order.
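As a rough sketch of that step, the box could be built with user variables. It assumes about 69.1 miles per degree of latitude (the figure used in the question) and scales longitude by the cosine of the latitude, which gives a slightly wider box than the question's fixed 53 miles per degree at this latitude. The variable names (@lat, @lon, @miles, @dlat, @dlon, @box) are illustrative and not part of the original answer.

```sql
-- Search centre and radius (values taken from the question).
SET @lat = 53.410778, @lon = -2.97784, @miles = 25;

-- Half-height and half-width of the box in degrees.
-- Assumption: ~69.1 miles per degree of latitude; a degree of
-- longitude shrinks by cos(latitude).
SET @dlat = @miles / 69.1;
SET @dlon = @miles / (69.1 * COS(RADIANS(@lat)));

-- Closed ring (first point repeated at the end).
-- X = latitude, Y = longitude, matching how geopoint was populated.
SET @box = CONCAT('POLYGON((',
    @lat - @dlat, ' ', @lon - @dlon, ',',
    @lat - @dlat, ' ', @lon + @dlon, ',',
    @lat + @dlat, ' ', @lon + @dlon, ',',
    @lat + @dlat, ' ', @lon - @dlon, ',',
    @lat - @dlat, ' ', @lon - @dlon, '))');

-- Inspect the generated WKT string.
SELECT @box;
```

The resulting string can then be passed as GeomFromText(@box) wherever a search polygon is needed. The box is only a coarse pre-filter: its corners lie further than 25 miles from the centre, so the distance check is still needed afterwards.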

Your final query will search for all POINT data that is within your search POLYGON, and you can then use a distance calculation to further refine and sort your data:

SELECT a.*,
    ROUND( SQRT( ( ( (a.latitude - 53.410778) * (a.latitude - 53.410778) ) * 69.1 * 69.1 ) + ( (a.longitude - (-2.97784)) * (a.longitude - (-2.97784)) * 53 * 53 ) ), 1 ) AS distance
FROM adverts a
WHERE a.type_id = 3
-- Placeholder polygon: replace with a box around the search centre.
-- The polygon goes first: CONTAINS(g1, g2) tests whether g1 contains g2.
AND CONTAINS(GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))'), a.geopoint)
HAVING distance < 25
ORDER BY distance ASC
LIMIT 0, 30

Note that the GeomFromText('Polygon((0 0,0 3,3 3,3 0,0 0))') in the above is only a placeholder and will not return useful results; you will need to replace the coordinates with valid points around your search start. If you expect the lat/long to change, you should consider using a trigger to keep the POINT data and the associated SPATIAL KEY up to date. With large datasets you should see vastly improved performance over calculating a distance for every record and filtering with a HAVING clause. I personally defined functions for determining the distance and creating the bounding POLYGON.
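The trigger and the distance function mentioned here might look something like the sketch below. The trigger names and the approx_miles function are hypothetical, not the answerer's actual code, and the function simply wraps the flat approximation from the question (69.1 and 53 miles per degree). Whether the insert trigger alone is enough for INSERT statements that omit geopoint depends on the server version and SQL mode, so that should be checked on the target server.

```sql
-- Hypothetical triggers: rebuild geopoint from latitude/longitude
-- the same way as the one-off UPDATE, on every insert and update.
CREATE TRIGGER adverts_geopoint_before_insert
BEFORE INSERT ON adverts
FOR EACH ROW
SET NEW.geopoint = GeomFromText(CONCAT('POINT(', NEW.latitude, ' ', NEW.longitude, ')'));

CREATE TRIGGER adverts_geopoint_before_update
BEFORE UPDATE ON adverts
FOR EACH ROW
SET NEW.geopoint = GeomFromText(CONCAT('POINT(', NEW.latitude, ' ', NEW.longitude, ')'));

-- Hypothetical helper: approximate distance in miles between two
-- lat/lon pairs, using the same constants as the question's query.
CREATE FUNCTION approx_miles(lat1 DOUBLE, lon1 DOUBLE, lat2 DOUBLE, lon2 DOUBLE)
RETURNS DOUBLE
DETERMINISTIC
RETURN SQRT(POW((lat1 - lat2) * 69.1, 2) + POW((lon1 - lon2) * 53, 2));
```

With such a function in place, the long expression in the SELECT list could be shortened to ROUND(approx_miles(53.410778, -2.97784, a.latitude, a.longitude), 1) AS distance.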
