Sql serve全文搜索Containstable在JOIN中使用时非常慢! [英] Sql serve Full Text Search with Containstable is very slow when Used in JOIN!

查看:91
本文介绍了Sql serve全文搜索Containstable在JOIN中使用时非常慢!的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在使用sql 2008全文搜索功能,而且根据我使用Contains或ContainsTable的方式,我遇到了严重的性能问题。



下面是示例:(表1有大约5000条记录,并且table1上有覆盖索引,其中包含where子句中的所有字段。所以请原谅我,如果有语法问题。)



场景1:

  select * from table1 as t1 
where t1.field1 = 90
and t1.field2 ='something'
和Exists(从containsstable中选择top 1 * (table1,*,'something')as t2
where t2。[key] = t1.id)

结果:1​​0秒(非常缓慢)

情景2:

 从table1中选择*作为t1 
在t2上加入containstable(table1,*,'something')作为t2。[key] = t1.id
where t1.field1 = 90
and t1.field2 ='something'

结果:1​​0秒(非常缓慢)

情景3:

 声明@tbl表(id uniqueidentifier主键)
插入@tbl从containstable(table1,*,'something')中选择{key]

从table1中选择*作为t1
其中t1.field1 = 90
和t1.field2 ='something'
和Exists(从@tbl中选择id作为tbl,其中id = req1.id)

结果:小数秒(超级快)

底线,看起来如果我在任何类型的连接中使用Containstable或where语句条件的where条件也有其他条件,性能非常糟糕。另外,如果您查看剖析器,则从数据库读取的数量将进入屋顶。但是,如果我首先进行全文搜索并将结果放入一个表格变量中并使用该变量,则所有内容都变得非常快。读取次数也低得多。它看起来在坏的情况下,不知何故它被卡在一个循环,导致它从数据库中读取很多次,但当然我不明白为什么。



现在的问题首先是为什么发生这种情况?问题二是如何扩展表变量?如果结果是成千上万的记录呢?它仍然会很快。



任何想法?
谢谢

解决方案

我在这个问题上花费了相当长的时间,并且基于运行很多场景, out:

如果您的查询中有任何位置包含Contains或ContainsTable,那就是首先执行并且相互独立执行的部分。这意味着,即使其他条件限制您的搜索只有一条记录,既不包含也不包含关于这一点的稳定关心。所以这就像一个平行执行。

现在,全文搜索只返回一个Key字段,它立即寻找Key作为为查询选择的其他索引的第一个字段。因此,对于上面的示例,它使用[key],field1,field2查找索引。问题在于它会根据where子句中的字段为其余查询选择一个索引。所以对于上面的例子,它选择了我有的那个像field1,field2,Id那样的覆盖索引。 (表格的ID与从全文搜索返回的[Key]相同)。总结如下:
$ b


  1. 执行containstable

  2. 执行查询的其余部分并选择一个基于索引的索引关于查询的where子句

  3. 它试图合并这两个。因此,如果它为查询的其余部分选择的索引以[key]字段开头,那就没有问题。但是,如果索引没有[key]字段作为第一个键,它将开始执行循环。它甚至不进行表扫描,否则通过5000条记录不会那么慢。循环的方式是运行FTS结果总数乘以查询其余部分结果总数的循环。因此,如果FTS返回2000条记录,并且查询的其余部分返回3000,则循环2000 * 3000 = 6,000,000。我不明白为什么。

因此,在我的情况下,它执行全文搜索,然后查询休息,覆盖指数,我有哪些是基于field1,field2,id(这是错误的),结果它拧了。如果我将覆盖索引更改为Id,field1,field2,则一切都会非常快。



我期待的是FTS返回一堆[key],其余的查询返回一堆[Id],然后Id应该与[key]匹配。

当然,我试图在这里简化我的查询,但实际的查询要复杂得多,我不能只改变索引。我也有一些情景,其中全文传递的文本是空白的,在这些场景中,我甚至不想加入到containstable中。
在这些情况下,将我的覆盖索引更改为id字段作为第一个字段,将会产生灾难。



无论如何,现在我选择了临时表解决方案因为它为我工作。我还将结果限制为几千个,这样可以在记录数量过高时帮助解决表变量的潜在性能问题。



谢谢


I am using sql 2008 full text search and I am having serious issues with performance depending on how I use Contains or ContainsTable.

Here are sample: (table one has about 5000 records and there is a covered index on table1 which has all the fields in the where clause. I tried to simplify the statements so forgive me if there is syntax issues.)

Scenario 1:

select * from table1 as t1
where t1.field1=90
and   t1.field2='something'
and   Exists(select top 1 * from containstable(table1,*, 'something') as t2 
where t2.[key]=t1.id)

results: 10 second (very slow)

Scenario 2:

select * from table1 as t1
join containstable(table1,*, 'something') as t2 on t2.[key] = t1.id
where t1.field1=90
and   t1.field2='something'

results: 10 second (very slow)

Scenario 3:

Declare @tbl Table(id uniqueidentifier primary key)
insert into @tbl select {key] from containstable(table1,*, 'something')

select * from table1 as t1
where t1.field1=90
and   t1.field2='something'
and  Exists(select id from @tbl as tbl where id=req1.id)

results: fraction of a second (super fast)

Bottom line, it seems if I use Containstable in any kind of join or where clause condition of a select statement that also has other conditions, the performance is really bad. In addition if you look at profiler, the number of reads from the database goes to the roof. But if I first do the full text search and put results in a table variable and use that variable everything goes super fast. The number of reads are also much lower. It seems in "bad" scenarios, somehow it gets stuck in a loop which causes it to read many times from teh database but of course I don't understant why.

Now the question is first of all whyis that happening? and question two is that how scalable table variables are? what if it results to 10s of thousands of records? is it still going to be fast.

Any ideas? Thanks

解决方案

I spent quite sometime on this issue, and based on running many scenarios, this is what I figured out:

if you have Contains or ContainsTable anywhere in your query, that is the part that gets executed first and rather independently. Meaning that even if the rest of the conditions limit your search to only one record, neither Contains nor containstable care about that. So this is like a parallel execution.

Now since fulltext search only returns a Key field, it immediately looks for the Key as the first field of other indexes chosen for the query. So for the example above, it looks for the index with [key], field1, field2. The problem is that it chooses an index for the rest of query based on the fields in the where clause. so for the example above it picks the covered index that I have which is something like field1, field2, Id. (Id of the table is the same as the [Key] returned from the full text search). So summary is:

  1. executes containstable
  2. executes the rest of the query and pick an index based on where clause of the query
  3. It tries to merge these two. Therefore, if the index that it picked for the rest of the query starts with the [key] field, it is fine. However, if the index doesn't have the [key] field as the first key, it starts doing loops. It does not even do a table scan, otherwise going through 5000 records would not be that slow. The way it does the loop is that it runs the loop for the total number of results from FTS multiplied by total number of results from the rest of the query. So if the FTS is returning 2000 records and the rest of the query returns 3000, it loops 2000*3000= 6,000,000. I donot understand why.

So in my case it does the full text search, then it does he rest of the query but picks the covered index that I have which is based on field1, field2,id (which is wrong) and as the result it screws up. If I change my covered index to Id, field1, field2 everything would be very fast.

My expection was that FTS returns bunch of [key], the rest of the query return bunch of [Id] and then the Id should be matched against [key].

Of course, I tried to simplify my query here, but the actual query is much more complicated and I cannot just change the index. I also do have scenarios where the text passed in full text is blank and in those scenarios I donot even want to join with containstable. In those cases changing my covered index to have the id field as the first field, will generate disaster.

Anyways, for now I chose the temp table solution since it is working for me. I am also limiting the result to a few thousand which helps with the potential performance issues of table variables when the number of records go too high.

thanks

这篇关于Sql serve全文搜索Containstable在JOIN中使用时非常慢!的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆