Make SQL Server index small numbers


Problem description



We're using SQL Server 2005 in a project. Users of the system can search some objects by 'keywords'. We implement this by creating a full-text catalog over the significant columns in each table that may contain these keywords, and then using CONTAINS to search that index for whatever the user types into the search box.

So, for example, let's say you have the Movie object and you want to let the user search for keywords in the title and plot of the movie. We'd index both the Title and Plot columns, and then do something like:

SELECT * FROM Movies WHERE CONTAINS(Title, keywords) OR CONTAINS(Plot, keywords)

(It's actually a bit more advanced than that, but nothing terribly complex)

Some users are adding numbers to their search, so for example they want to find 'Terminator 2'. The problem here is that, as far as I know, by default SQL Server won't index short words, thus doing a search like this:

SELECT * FROM Movies WHERE CONTAINS(Title, '"Terminator 2"')

is actually equivalent to doing this:

SELECT * FROM Movies WHERE CONTAINS(Title, '"Terminator"') -- note the missing '2'

and we are getting a plethora of spurious results.

Is there a way to force SQL Server to index small words? Preferably, I'd rather index only numbers like 1, 2, 21, etc. I don't know where to define the indexing criteria, or even if it's possible to be as specific as that.


Update: I followed the advice in the answer and removed the numbers from the noise-word list. The behaviour is now a bit different, but still not what you'd expect.

A search for "Terminator 2" now returns nothing, even though I know there are objects containing both words. (I'm just making this example up; my employer might not be happy if I disclosed what we are doing. The real terms are a bit different, but the principle is the same.)

Maybe I'm doing something wrong? I removed all numbers 1 ... 9 from my noise configuration for ENG, ENU and NEU (neutral), regenerated the indexes, and tried the search.

Solution

These "small words" are considered "noise words" by the full text index. You can customize the list of noise words. This blog post provides more details. You need to repopulate your full text index when you change the noise words file.
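
As a rough sketch of the repopulation step, assuming the full-text index is defined on the Movies table from the question (the dbo schema and the catalog name MoviesCatalog are hypothetical, since neither is given above), something like the following could be run after editing the noise-word files. On SQL Server 2005 these are plain-text files such as noiseENU.txt in the instance's FTData directory, and the full-text service may need a restart before the edited list is picked up.

```sql
-- Assumption: the full-text index lives on dbo.Movies, as in the question.
-- Start a full population so the index is rebuilt with the edited noise-word list.
ALTER FULLTEXT INDEX ON dbo.Movies START FULL POPULATION;

-- Alternative: rebuild the whole catalog (MoviesCatalog is a hypothetical name).
-- ALTER FULLTEXT CATALOG MoviesCatalog REBUILD;

-- Population runs asynchronously; a PopulateStatus of 0 means the catalog is idle.
SELECT FULLTEXTCATALOGPROPERTY('MoviesCatalog', 'PopulateStatus') AS PopulateStatus;

-- Once the population has finished, retry the phrase search from the question.
SELECT * FROM dbo.Movies WHERE CONTAINS(Title, '"Terminator 2"');
```

Because the population is asynchronous, running the search before it completes can still return no rows, which may be worth ruling out before concluding that the noise-word change had no effect.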
