自2008 R2版以来是否有任何Sql Server全文本搜索(FTS)性能改进? [英] Are there any Sql Server Full-Text Search (FTS) performance improvements since version 2008 R2?

查看:126
本文介绍了自2008 R2版以来是否有任何Sql Server全文本搜索(FTS)性能改进?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我们使用SQL Server 2008 R2全文搜索功能来搜索260万条记录。搜索性能通常很差,它遵循通常报道的模式:冷系统/首次运行〜10 +秒,随后运行〜1-2秒。这与2013年2月在以下文章中报告的结果是一致的:

所以你认为你可以搜索 - 比较Microsoft SQL Server FTS和Apache Lucene



本文使用维基百科转储数据显示以下速度比较结果:

 
索引速度,大小和单个查询执行时间使用:

Lucene MS SQL FTS
索引速度3 MB /秒1 MB /秒
索引大小10 -25%25-30%
简单查询<20毫秒<20毫秒
自定义分数查询<4秒> 20秒



 
并行查询执行(10个线程,每个查询的平均执行时间,单位为ms):

MS SQL FTS Lucene (文件系统)Lucene(RAM)
冷系统:简单查询56 643 21
提升查询19669 * 859 27
第二次执行:简单查询14 8 <5
提升查询465 17 9

*的平均时间,第一个查询最多可以执行2分钟(!)

我的问题是:


  1. 由于自2013年2月8日该文章发布以来,有几个主要的SQL Server版本,有人可以报告任何FTS相对于同一数据的性能改进(最好是100万条记录)?

  2. 更多最近的SQL Server版本支持FTS目录/索引与solr / lucene一样放在RAM中? b $ b 在我们的场景中,我们很少将新数据插入到FT目录链接表中,但r不常读取只读搜索。所以,我不认为SQL会不断重建FTS索引是个问题。

    解决方案

全文搜索SQL Server 2012中的改进


我们查看了整个代码库,从查询在等待正在进行的索引更新以释放共享模式锁,从索引片段填充期间分配多少内存到如何重新组织查询代码库作为流式表值函数以优化TOP N搜索查询,我们如何维护密钥分布直方图以在并行线程上执行搜索,全部我们如何才能更好地利用处理器计算指令(例如评分等级)...最终的结果是,我们能够显着提升性能(在许多情况下,对于并发索引更新与大量查询工作量)和规模,而不必更改任何存储结构或现有的API表面。我们所有从SQL 2008 / R2到Denali的客户都将从中受益。



we're using SQL Server 2008 R2 Full-Text Search over a table with 2.6 million records. The search performance often is poor, it follows the commonly reported pattern: cold system/first run ~10+ sec, subsequent runs ~1-2 sec. This is inline with results reported in the following article dated of Feb, 2013:

So You Think You Can Search – Comparing Microsoft SQL Server FTS and Apache Lucene

The article shows the following speed comparison results using Wikipedia dump data:

Indexing speed, size and single query execution time using:

                        Lucene      MS SQL FTS
Indexing Speed          3 MB/sec    1 MB/sec
Index Size              10-25%      25-30%
Simple query            < 20 ms     < 20 ms
Query With Custom Score < 4 sec     > 20 sec

Parallel Query Executions (10 threads, average execution time per query in ms):

                                     MS SQL FTS  Lucene (File System)   Lucene (RAM)
Cold System:         Simple Query    56          643                    21
                     Boost Query     19669*      859                    27
Second executions:   Simple Query    14          8                      < 5
                     Boost Query     465         17                     9

*average time, the very first query could be executed up to 2 min(!)

My questions are:

  1. Since there were several major SQL Server releases since the article was published on Feb 8, 2013, can someone report any FTS performance improvements over same data (preferably of 1+ million records) when they migrated to more recent SQL Server versions (2012, 2014 and 2016)?

  2. Do more recent SQL Server versions support FTS catalogs/indexes placed in RAM just as solr/lucene do?

UPDATE: in our scenario we seldom insert new data into FT catalog linked table, but run read only searches very often. So, I don't think SQL constantly rebuilding FTS index is the issue.

解决方案

Fulltext Search Improvements in SQL Server 2012:

We looked at the entire code base from how queries block while waiting an ongoing index update to release a shared schema lock, from how much memory is allocated during index fragment population, to how we could reorganize the query code base as a streaming Table Value Function to optimize for TOP N search queries, how we could maintain key distribution histograms to execute search on parallel threads, all the way to how we could take better advantage of the processor compute instructions (scoring ranks for example)… End result is that we are able to significantly boost performance (10X in many cases when it comes to concurrent index updates with large query workloads) and scale without having to change any storage structures or existing API surface. All our customers going from SQL 2008 / R2 to Denali will benefit with this improvement.

这篇关于自2008 R2版以来是否有任何Sql Server全文本搜索(FTS)性能改进?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆