Fastest way for this query (What is the best strategy) given a date range


Problem description

I have a table A that has a startDate and an endDate as two datetime columns, besides some other columns. I have another table B that has one datetime column; call it the dates column. This is in SQL Server 2005.

Here is the question: how best to set up the indexes etc. to get the following:

select ....
 from A , B
where A.startDate >= B.dates
  and A.endDate < B.dates

Both tables have several thousand records.

Recommended answer

Update:

See this article in my blog for an efficient indexing strategy for your query using computed columns:

The main idea is that we compute a rounded length and a rounded startDate for your ranges and then search for them using equality conditions (which B-Tree indexes handle well).

In MySQL and in SQL Server 2008 you could use SPATIAL indexes (an R-Tree in MySQL; SQL Server builds its spatial indexes as a grid decomposition stored in a B-Tree).

They are particularly good for conditions like "select all records with a given point inside the record's range", which is exactly your case.

You store the start_date and end_date as the beginning and the end of a LineString (converting them to UNIX timestamps or another numeric value), index them with a SPATIAL index, and search for all such LineStrings whose minimum bounding rectangle (MBR) contains the date value in question, using MBRContains.

See this entry in my blog on how to do this in MySQL:

and a brief performance overview for SQL Server:

The same solution can be applied to searching for a given IP address against network ranges stored in the database.

This task, along with your query, is another frequently used example of such a condition.

Plain B-Tree indexes are not good if the ranges can overlap.

If they cannot (and you know it), you can use the brilliant solution proposed by @AlexKuznetsov

Also note that the performance of this query depends entirely on your data distribution.

If you have lots of records in B and few records in A, you could just build an index on B.dates and accept the table scan / clustered index scan (TS/CIS) on A.

This query will always read all rows from A and will use an Index Seek on B.dates in a nested loop.
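The access pattern described above (scan every range in A, seek into an ordered index on B.dates) can be illustrated outside the database. The following is only a minimal Python sketch, not SQL Server's actual execution: the function name `match_ranges_to_dates` and the integer "dates" are hypothetical, a sorted list stands in for the index, and it assumes the intended condition is that the date falls inside the range (start <= date <= end).

```python
from bisect import bisect_left, bisect_right

def match_ranges_to_dates(ranges, dates):
    """For every (start, end) range, collect the dates d with start <= d <= end."""
    sorted_dates = sorted(dates)  # stand-in for the index on B.dates
    matches = []
    for start, end in ranges:  # full scan of A: every range is visited once
        lo = bisect_left(sorted_dates, start)   # "seek" to the first date >= start
        hi = bisect_right(sorted_dates, end)    # first position after the last date <= end
        for d in sorted_dates[lo:hi]:
            matches.append((start, end, d))
    return matches

# Illustrative data: integers play the role of datetime values.
ranges = [(10, 20), (15, 18), (40, 50)]
dates = [5, 12, 17, 19, 45, 60]
result = match_ranges_to_dates(ranges, dates)
```

Each range costs one binary search plus the matching dates, which is why this shape is attractive when A is small and B is large.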

If your data are distributed the other way round, i.e. you have lots of rows in A but few in B, and the ranges are generally short, then you could redesign your tables a little:

A

start_date interval_length

, create a composite index on A (interval_length, start_date)

and use this query:

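SELECT  *
FROM    (
        -- one row per distinct range length
        SELECT  DISTINCT interval_length
        FROM    a
        ) ai
CROSS JOIN
        b
JOIN    a
ON      a.interval_length = ai.interval_length
        -- a range of this length contains b.dates only if it starts no earlier than b.dates - interval_length
        AND a.start_date BETWEEN b.dates - ai.interval_length AND b.dates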
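To see why the query above can use the composite index, here is a minimal Python sketch of the same lookup. It is an illustration under assumptions, not the database's implementation: `build_index` and `ranges_containing` are hypothetical names, dates are plain integers, and a dictionary of sorted lists stands in for the index on (interval_length, start_date).

```python
from bisect import bisect_left, bisect_right
from collections import defaultdict

def build_index(rows):
    """Group start values by interval length and sort them:
    a stand-in for the composite index on (interval_length, start_date)."""
    index = defaultdict(list)
    for start, length in rows:
        index[length].append(start)
    for starts in index.values():
        starts.sort()
    return index

def ranges_containing(index, date):
    """Return the (start, length) rows with start <= date <= start + length."""
    found = []
    for length, starts in index.items():  # one pass per DISTINCT interval_length
        lo = bisect_left(starts, date - length)  # start >= date - length
        hi = bisect_right(starts, date)          # start <= date
        found.extend((s, length) for s in starts[lo:hi])
    return sorted(found)

# Illustrative rows of A as (start_date, interval_length).
a_rows = [(10, 10), (15, 3), (40, 10), (12, 3)]
index = build_index(a_rows)
hits = ranges_containing(index, 17)
```

For a fixed length, the condition becomes a plain range on start_date, so each distinct length costs one seek. This is why the approach pays off when the ranges are short and there are few distinct lengths.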
