Distance Calculation with a Huge SQL Server Database

Question

I have a huge database of businesses (about 500,000) with zip code, address, etc. I need to display the ones within 100 miles of the user's zip code, in ascending order of distance. I have a zip code table with the corresponding latitude and longitude. Which solution would be faster/better?

Case 1: calculate the distance and sort by it. I will have the user's current zip code, latitude and longitude in the session, and I will calculate the distance using a SQL Server function.

Case 2: get all zip codes within a 50-mile area, then get the businesses that have any of those zip codes. Here I will have to write a select in a nested query when finding the businesses.

I think case 1 will calculate the distance for every business in the database, while case 2 will fetch only the zip codes and so end up fetching only the required businesses. Hence case 2 should be better? I would appreciate any suggestions.

Here is the LINQ query I have for case 1.

var businessListQuery = (from b in _DB.Businesses
                         let distance = _DB.CalculateDistance(b.Zipcode, userLattitude, userLogntitude)
                         where b.BusinessCategories.Any(bc => bc.SubCategoryId == subCategoryId)
                               && distance < 100
                         orderby distance
                         select new BusinessDetails(b, distance.ToString()));

int totalRecords = businessListQuery.Count();
// Page in the database: calling ToList() before Skip/Take would load every matching row first.
var ret = businessListQuery.Skip(startRow).Take(pageSize).ToList();

On a side note, the app is written in C#.

Thanks.

Recommended answer

You could do worse than look at the GEOGRAPHY data type, for example:

CREATE TABLE Places
(
    SeqID       INT IDENTITY(1,1),
    Place       NVARCHAR(20),
    Location    GEOGRAPHY
)
GO
INSERT INTO Places (Place, Location) VALUES ('Coventry', geography::Point(52.4167, -1.55, 4326))
INSERT INTO Places (Place, Location) VALUES ('Sheffield', geography::Point(53.3667, -1.5, 4326))
INSERT INTO Places (Place, Location) VALUES ('Penzance', geography::Point(50.1214, -5.5347, 4326))
INSERT INTO Places (Place, Location) VALUES ('Brentwood', geography::Point(51.6208, 0.3033, 4326))
INSERT INTO Places (Place, Location) VALUES ('Inverness', geography::Point(57.4760, -4.2254, 4326))
GO
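-- Every ordered pair of places, including each place paired with itself.
SELECT p1.Place AS Place1, p2.Place AS Place2,
       p1.Location.STDistance(p2.Location) / 1000 AS DistanceInKilometres
    FROM Places p1
        CROSS JOIN Places p2
GO
-- Each unordered pair once. STDistance returns metres for SRID 4326.
SELECT p1.Place AS Place1, p2.Place AS Place2,
       p1.Location.STDistance(p2.Location) / 1000 AS DistanceInKilometres
    FROM Places p1
        INNER JOIN Places p2 ON p1.SeqID > p2.SeqID
GO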

geography::Point takes the latitude and longitude as well as an SRID (spatial reference identifier). In this case the SRID is 4326, which is WGS 84, the standard latitude/longitude coordinate system used by GPS. As you already have latitude and longitude, you can just ALTER TABLE to add the geography column, then UPDATE to populate it.
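As a minimal sketch of that ALTER TABLE / UPDATE step, assume the zip code table is called ZipCodes and has Zipcode, Latitude and Longitude columns. These names are hypothetical, since the question does not give them. The final query shows one possible radius search: STDistance returns metres for SRID 4326, so miles are converted at 1,609.344 metres per mile. The sample point is simply the Coventry coordinates used above.

```sql
-- Hypothetical table and column names: ZipCodes(Zipcode, Latitude, Longitude).
ALTER TABLE ZipCodes ADD Location GEOGRAPHY NULL;
GO

-- geography::Point takes (latitude, longitude, SRID), in that order.
UPDATE ZipCodes
SET Location = geography::Point(Latitude, Longitude, 4326)
WHERE Latitude IS NOT NULL
  AND Longitude IS NOT NULL;
GO

-- Zip codes within 100 miles of a given point, nearest first.
DECLARE @UserLocation GEOGRAPHY = geography::Point(52.4167, -1.55, 4326);

SELECT Zipcode,
       Location.STDistance(@UserLocation) / 1609.344 AS DistanceInMiles
FROM ZipCodes
WHERE Location.STDistance(@UserLocation) <= 100 * 1609.344
ORDER BY DistanceInMiles;
GO
```

The GO after ALTER TABLE matters: the new column is not visible to an UPDATE compiled in the same batch.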

I've shown two ways to get the data out of the table; however, you can't create an indexed view from either of them (indexed views can't contain self-joins). You could, though, create a secondary table that is effectively a cache, populated from the queries above. You then just have to worry about maintaining it (which could be done through triggers or some other process).
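One way such a cache table might look is sketched here for the Places example. The table name PlaceDistances, its column names and the primary key are assumptions made for illustration, and the key presumes place names are unique.

```sql
-- Hypothetical cache table holding precomputed pairwise distances.
CREATE TABLE PlaceDistances
(
    Place1               NVARCHAR(20) NOT NULL,
    Place2               NVARCHAR(20) NOT NULL,
    DistanceInKilometres FLOAT        NOT NULL,
    CONSTRAINT PK_PlaceDistances PRIMARY KEY (Place1, Place2)
);
GO

-- Populate it from the cross join: one row per ordered pair of places.
INSERT INTO PlaceDistances (Place1, Place2, DistanceInKilometres)
SELECT p1.Place,
       p2.Place,
       p1.Location.STDistance(p2.Location) / 1000
FROM Places p1
    CROSS JOIN Places p2;
GO
```

Rows would need to be added or recalculated whenever a place is inserted or its location changes, which is the maintenance work mentioned above.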

Note that with 500,000 rows the cross join will give you 250,000,000,000 rows, but searching is simple, as you only need to look at one of the place columns (e.g., SELECT * FROM table WHERE Place1 = 'Sheffield' AND distance < 100). The second query will give you about half as many rows, since each pair appears only once, but a search then needs to consider both the Place1 and Place2 columns.
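The difference between the two searches can be sketched as follows. Both queries assume hypothetical cache tables with Place1, Place2 and DistanceInKilometres columns: PlaceDistances filled from the cross join, and PlaceDistancesHalf filled from the SeqID > SeqID join.

```sql
-- Full cache: every pair is stored in both directions, so one column is enough.
SELECT Place2 AS Neighbour, DistanceInKilometres
FROM PlaceDistances
WHERE Place1 = 'Sheffield'
  AND DistanceInKilometres < 100
ORDER BY DistanceInKilometres;

-- Half cache: the place may be on either side of the pair, so check both columns.
SELECT CASE WHEN Place1 = 'Sheffield' THEN Place2 ELSE Place1 END AS Neighbour,
       DistanceInKilometres
FROM PlaceDistancesHalf
WHERE (Place1 = 'Sheffield' OR Place2 = 'Sheffield')
  AND DistanceInKilometres < 100
ORDER BY DistanceInKilometres;
```

The OR in the second query generally makes it harder to serve from a single index, which is the trade-off for storing half the rows.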
