sys.dm_fts_parser sql full text

Problem description

We are having a really hard time figuring out why two similar strings passed to sys.dm_fts_parser give different results:

select * from sys.dm_fts_parser('"0 CAD"', 0, null, 0) 

seems to treat "0 CAD" as a single token (returns 2 tokens)

select * from sys.dm_fts_parser('"0 cad"', 0, null, 0) 

returns 3 tokens, which is correct.

More importantly, and even more confusing, is why

select * from Table where contains(*, '"point 5 CAD"') works, and
select * from Table where contains(*, '"point 5 cad"') fails

where the column searched contains "point 5 CAD".

Shouldn't the full-text index builder either ignore noise words (e.g. "5") or include them, depending on the index setting?
We have tried both settings and can't explain why "nnnn CAD" is treated as something special.

Note that full-text search is supposed to be case-insensitive according to http://msdn.microsoft.com/en-us/library/ms142583.aspx

What am I missing?

Edit: Using SQL 2012 11.0.2218

Solution

On SQL 2008:

select * from sys.dm_fts_parser('"0 CAD"', 0, null, 0) -- gives 2 tokens
select * from sys.dm_fts_parser('"0 CAD"', 1033, null, 0) -- gives 3 tokens

On SQL 2012 (11.0.3218):

select * from sys.dm_fts_parser('"0 CAD"', 1033, null, 0) -- gives 2 tokens

In SQL 2012, Microsoft introduced a new word breaker (version 14.0.4763.1000): http://msdn.microsoft.com/en-us/library/gg509108.aspx

It seems that the word breaker now recognizes three-character ISO 4217 currency codes, and when a number comes directly before such a code, the pair is not broken up.
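
To see the difference on your own instance, a query along these lines puts the two parses side by side. It is only a sketch: it assumes LCID 1033 (English) and the system stoplist setting used above, and the `variant` label column is added here purely for readability. The exact rows returned will depend on the SQL Server version and the installed word breaker.

```sql
-- Compare how the word breaker tokenizes the upper-case and lower-case phrases.
-- Arguments: query string, LCID (1033 = English), stoplist id (null = none),
-- accent sensitivity (0 = insensitive).
select 'upper' as variant, occurrence, special_term, display_term
from sys.dm_fts_parser('"0 CAD"', 1033, null, 0)
union all
select 'lower' as variant, occurrence, special_term, display_term
from sys.dm_fts_parser('"0 cad"', 1033, null, 0)
order by variant, occurrence;
```

If the upper-case variant yields fewer rows than the lower-case one, the number and the currency code are being kept together as described above.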
