Recommended indexes for a query on a large table involving a 'date range' and an 'order id'

Question

I have a query (generated by LINQ to SQL) that returns a list of 'site visits' made within a certain date range that resulted in an order (OrderId is not null).

There's nothing wrong with the query. I just need advice on creating the correct index for it. I was trying different combinations on a production site and managed to break things badly enough that a foreign key got disconnected. I fixed that after some panic, but thought I'd ask for advice before recreating an index.

The table is getting close to a million rows and I need the indexes to help me out here. This query is only used for reporting, so it doesn't have to be extremely fast; it just must not delay other users' queries (which it currently does).

SELECT TOP 1000
  t0.SiteVisitId, t0.OrderId, t0.Date, 
  t1.Domain, t0.Referer, t0.CampaignId
FROM 
  SiteVisit AS t0
  LEFT OUTER JOIN KnownReferer AS t1 ON t1.KnownRefererId = t0.KnownRefererId
WHERE
  t0.Date <= @p0 
  AND t0.Date >= @p1
  AND t0.OrderId IS NOT NULL
ORDER BY 
  t0.Date DESC

@p0='2008-11-1 23:59:59:000', @p1='2008-10-1 00:00:00:000'

I currently have a clustered index on SiteVisitId, which is my identity integer column.

I don't know which of the following is likely to be most efficient:

  • Create an index on Date
  • Create an index on Date AND a separate index on OrderId
  • Create a 'multicolumn' index on Date AND OrderId
  • Some other combination?

I am also wondering whether a separate bit column such as hasOrder would be more efficient than checking OrderId IS NOT NULL.

FYI: KnownReferer is just a table containing a list of 100 or so known HTTP referers, so I can easily see how many hits came from Google, Yahoo, etc.

Recommended answer

How many rows do you expect a typical date range to contain? Are you typically looking at a month at a time?

I would start out with an index over the [Date] column. If, for a typical query, the resulting row count is small, you shouldn't need to add the [OrderId] column to your index.
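A minimal sketch of that starting point is shown below. The index name IX_SiteVisit_Date and the dbo schema are assumptions; the table and column names are taken from the query in the question.

```sql
-- Hypothetical index name; the dbo schema is assumed.
-- Single-column nonclustered index to support the range predicate on [Date].
CREATE NONCLUSTERED INDEX IX_SiteVisit_Date
    ON dbo.SiteVisit ([Date]);
```

With this index alone, the remaining selected columns still have to be fetched from the clustered index for each row in the date range, which is why the size of a typical range matters.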

On the other hand, if a typical month contains a large number of rows, you can add the [OrderId] column to the index, though since it's being treated as a boolean value, it may not buy you much. It depends on how many rows are NULL vs NOT NULL. If you have a lot of rows for a given month but only a few have a valid [OrderId], then the index would probably improve performance.
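One way to check the NULL vs NOT NULL split before deciding is sketched below, followed by the two-column index itself. The index name and the dbo schema are assumptions, and the date literals simply mirror the October 2008 example in the question.

```sql
-- COUNT(*) counts every row in the range; COUNT(OrderId) counts only non-NULL values.
SELECT COUNT(*)       AS TotalVisits,
       COUNT(OrderId) AS VisitsWithOrder
FROM dbo.SiteVisit
WHERE [Date] >= '20081001'
  AND [Date] <  '20081101';

-- Hypothetical index name. [Date] leads so the range predicate can seek;
-- OrderId is carried in the index so NULL rows can be discarded
-- before the base table is touched.
CREATE NONCLUSTERED INDEX IX_SiteVisit_Date_OrderId
    ON dbo.SiteVisit ([Date], OrderId);
```

If VisitsWithOrder is a small fraction of TotalVisits, the second column is more likely to pay for itself.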

Read the accepted answer in this related question and determine if it's worth indexing over the additional column:

Should I index a bit field in SQL Server?

And of course, test the indexes and compare the plans generated with and without each index.
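A rough way to run that comparison from a query window is sketched here. It reuses the query and parameter values from the question; the dbo schema is an assumption. Run it once before and once after creating an index and compare the logical reads and elapsed time reported in the messages output.

```sql
-- Report I/O and timing figures for each statement in this session.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

DECLARE @p0 datetime, @p1 datetime;
SET @p0 = '2008-11-01T23:59:59';
SET @p1 = '2008-10-01T00:00:00';

SELECT TOP 1000
  t0.SiteVisitId, t0.OrderId, t0.[Date],
  t1.Domain, t0.Referer, t0.CampaignId
FROM
  dbo.SiteVisit AS t0
  LEFT OUTER JOIN dbo.KnownReferer AS t1 ON t1.KnownRefererId = t0.KnownRefererId
WHERE
  t0.[Date] <= @p0
  AND t0.[Date] >= @p1
  AND t0.OrderId IS NOT NULL
ORDER BY
  t0.[Date] DESC;

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
```

Because the values here are local variables rather than the parameters LINQ to SQL sends, the estimated row counts may differ from what the production query sees, so treat the numbers as a guide rather than an exact reproduction.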

Update: Some of the other answers specify a more aggressive index, which should improve performance of this query but may adversely affect other operations on the table. For example, the suggested covering index will allow SQL Server to process this query with little impact on the actual table, but may cause problems when other queries write to that table (since SQL Server will then need to update both the table and the covering index).
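For illustration, a covering index for this query could look roughly like the sketch below. The exact index proposed in the other answers is not reproduced here, so the name, the dbo schema, and the choice of included columns are assumptions based on the columns the query reads. It also assumes Referer is a type that is allowed in INCLUDE (for example varchar or nvarchar, not text or ntext).

```sql
-- Hypothetical covering index: [Date] is the key; the other columns the query
-- reads are stored at the leaf level. SiteVisitId is the clustered key, so it
-- is carried in every nonclustered index automatically.
CREATE NONCLUSTERED INDEX IX_SiteVisit_Date_Covering
    ON dbo.SiteVisit ([Date])
    INCLUDE (OrderId, KnownRefererId, Referer, CampaignId);
```

The wider the included columns, the more each insert or update on SiteVisit has to write, which is the trade-off described above.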

Because this is a reporting query, I would optimize it as little as possible. If this query is running long and causing other, more critical queries to run slowly or time out, I would optimize it only enough to reduce its effect on those other queries.

That said, if you expect this table to keep growing, I would consider a separate reporting schema that periodically extracts data from this table.
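One simple form of such an extract is a scheduled full rebuild, sketched below. The reporting schema, the SiteVisitOrder table, and the index name are all hypothetical, and the dbo schema is assumed. The rebuild still scans the source table each time it runs, so it would suit an off-peak schedule.

```sql
-- Create the (hypothetical) reporting schema once.
IF SCHEMA_ID('reporting') IS NULL
    EXEC('CREATE SCHEMA reporting');

-- Drop the previous snapshot, if any.
IF OBJECT_ID('reporting.SiteVisitOrder', 'U') IS NOT NULL
    DROP TABLE reporting.SiteVisitOrder;

-- Copy only visits that resulted in an order, with the referer domain resolved.
SELECT sv.SiteVisitId, sv.OrderId, sv.[Date],
       kr.Domain, sv.Referer, sv.CampaignId
INTO reporting.SiteVisitOrder
FROM dbo.SiteVisit AS sv
LEFT OUTER JOIN dbo.KnownReferer AS kr
    ON kr.KnownRefererId = sv.KnownRefererId
WHERE sv.OrderId IS NOT NULL;

-- Reports filter and sort by date, so cluster the snapshot on it.
CREATE CLUSTERED INDEX IX_SiteVisitOrder_Date
    ON reporting.SiteVisitOrder ([Date]);
```

Reports would then query reporting.SiteVisitOrder instead of SiteVisit, keeping the reporting load off the table that serves other users.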
