SQL Server CONTAINSTABLE not working for single digit numbers

Problem description

This question is about SQL Server's FTS ContainsTable.

To replicate the issue, we can use the script below which will create one table and fill it with addresses.

CREATE TABLE Address (FullAddress nvarchar(100) NOT NULL);  
CREATE UNIQUE CLUSTERED INDEX AddressKey ON Address(FullAddress);  
INSERT INTO Address VALUES ('1 OLD YONGE ST, AURORA, ON');  
INSERT INTO Address VALUES ('1 OLD YONGE ST, NORTH YORK, ON');
INSERT INTO Address VALUES ('1 YONGE ST N UNIT 1, HUNTSVILLE, ON');
INSERT INTO Address VALUES ('1 YONGE ST N UNIT 10, HUNTSVILLE, ON');
INSERT INTO Address VALUES ('18 YONGE ST UNIT 324, TORONTO, ON');
INSERT INTO Address VALUES ('10415 YONGE ST UNIT 1, RICHMOND HILL, ON');
INSERT INTO Address VALUES ('11211 YONGE ST UNIT 37 BUILDING A, RICHMOND HILL, ON');

Now we will create the fulltext catalog and create an index on it.

CREATE FULLTEXT CATALOG AddressCat;  
CREATE FULLTEXT INDEX ON Address(FullAddress) KEY INDEX AddressKey ON AddressCat; 

Issue

If we search for addresses that start with 1 (note that this is a single digit), with the 1 NEAR the next term, the prefix Yon*, we expect the query to return the first 4 records above. Here is the query:

SELECT * FROM CONTAINSTABLE (Address, FullAddress, '"1" NEAR "Yon*"') ORDER BY RANK DESC;

However, it returns no rows. This is the issue.

However, if we run the same query with a two-digit number such as 10 or 11, it returns records as expected.

Question:

Why does CONTAINSTABLE not return any results for single-digit searches?

Answer

Finding the cause of the issue

I tried many things, such as changing the query to one of the following:

SELECT * FROM CONTAINSTABLE (Address, FullAddress, 'NEAR((1, YONGE), 5, TRUE)') 
-- or this
SELECT * FROM CONTAINSTABLE (Address, FullAddress, '1 YON*')

but without any luck.

After some searching online, I started thinking (since the issue only happens with single digits) that it may have something to do with Stopwords:

Stopwords. A stopword can be a word with meaning in a specific language. For example, in the English language, words such as "a," "and," "is," and "the" are left out of the full-text index since they are known to be useless to a search. A stopword can also be a token that does not have linguistic meaning.

Then, with the help of this SO answer, I was able to figure out how SQL Server was interpreting my search. Here is the query:

select * from sys.dm_fts_parser('"1" NEAR "Yon*"',2057, 0, 0)

Notice how, in the output of that query, the search term 1 is treated as noise. This was the issue. Running the following query then listed all the noise words and, sure enough, the digits 0-9 were all there:

SELECT ssw.*, ssw.stopword, slg.name
      FROM sys.fulltext_system_stopwords ssw
      JOIN sys.fulltext_languages slg
      ON slg.lcid = ssw.language_id
      WHERE slg.lcid = 1033 -- English
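
As a cross-check of that diagnosis, the parser can be run twice on the same search string, once with the system stoplist and once with no stoplist at all. This is only a sketch under a few assumptions: it uses LCID 1033 to match the stopword query above (the earlier parser call used 2057), it relies on the documented convention that a stoplist_id of 0 means the system stoplist and NULL means no stoplist, and it needs the elevated permissions that sys.dm_fts_parser requires.

```sql
-- With the system stoplist (stoplist_id = 0): the row for "1" is expected
-- to come back with special_term = 'Noise Word'.
SELECT display_term, special_term
FROM sys.dm_fts_parser('"1" NEAR "Yon*"', 1033, 0, 0);

-- With no stoplist (stoplist_id = NULL): no term should be flagged as noise.
SELECT display_term, special_term
FROM sys.dm_fts_parser('"1" NEAR "Yon*"', 1033, NULL, 0);
```

If the first result flags the term as noise and the second does not, the stoplist rather than the query syntax is what drops the single digit.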

Solution

One solution would be to remove the single-digit numbers from the noise words, but I could not find out how to do that. In my case that would not be ideal anyway: the users of my system will only be searching for addresses, so if they type is or the, I do not want the system to treat it as noise, since they may be searching for a street that starts with is.
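
For readers who do want to keep the other stopwords, removing only the digits appears to be possible with a custom stoplist. The following is a sketch, not something tested in this post: the stoplist name AddressStoplist is hypothetical, and LANGUAGE 1033 assumes the column is indexed as US English, so adjust it to the language of your full-text index.

```sql
-- Hypothetical stoplist name; start from a copy of the system stoplist
-- so that all other stopwords are kept.
CREATE FULLTEXT STOPLIST AddressStoplist FROM SYSTEM STOPLIST;

-- Drop each single digit for the assumed index language (1033).
ALTER FULLTEXT STOPLIST AddressStoplist DROP '0' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '1' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '2' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '3' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '4' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '5' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '6' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '7' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '8' LANGUAGE 1033;
ALTER FULLTEXT STOPLIST AddressStoplist DROP '9' LANGUAGE 1033;

-- Attach the custom stoplist to the full-text index on the Address table.
ALTER FULLTEXT INDEX ON Address SET STOPLIST = AddressStoplist;
```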

I removed the stoplist altogether using the query below and now everything works as expected:

ALTER FULLTEXT INDEX ON [Address] SET STOPLIST = off
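
After a stoplist change the index may need to finish repopulating before the results change. A minimal way to check, assuming the AddressCat catalog created earlier, is to poll the catalog's populate status and then re-run the original query:

```sql
-- 0 means the catalog is idle, i.e. no population is currently in progress.
SELECT FULLTEXTCATALOGPROPERTY('AddressCat', 'PopulateStatus') AS PopulateStatus;

-- The original query from the question; once the stoplist is off,
-- the single-digit term is no longer discarded as noise.
SELECT * FROM CONTAINSTABLE (Address, FullAddress, '"1" NEAR "Yon*"') ORDER BY RANK DESC;
```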

Hopefully this helps someone else.
