Is there any way to search for numbers with leading zeros in SQL Server Full-text indexes


Question


We have a table with a Body (NVARCHAR(MAX)) column that contains text from emails and files. The column is full-text indexed.

Some of the documents contain reference numbers such as 00123. However, the full-text engine seems to strip leading zeros, so when we search using CONTAINS(Body, '00123') it also returns false positives containing just 123.

Is there any way to fix this? Ideally there would be a way to address this in the query, but we would also consider other options such as alternative word breakers.

We are using SQL Server 2008 R2 and later.

Solution

According to SQL Server 2012's Behavior Changes to Full-Text Search page, the previous version of the word breakers, when given the term 022, produced 022 and nn022, but the new version produces 022 and nn22. So SQL Server 2008 R2 will produce the desired result when searching for numbers with leading zeros, but SQL Server 2012 will not. (This assumes the full-text indexed columns use English as their word-breaking language.)
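To see which behavior a given instance has, you can ask the word breaker directly. The following is a minimal sketch using the sys.dm_fts_parser dynamic management function. It assumes English (LCID 1033), no stoplist, and accent-insensitive parsing, and as far as I know it requires sysadmin membership.

```sql
-- Show the terms the English word breaker emits for a number with leading zeros.
-- Arguments: query string, LCID (1033 = English), stoplist id (0 = none),
-- accent sensitivity (0 = insensitive).
SELECT keyword, display_term, special_term, occurrence
FROM sys.dm_fts_parser(N'"00123"', 1033, 0, 0);
```

If the display_term values keep the leading zeros in both emitted terms, the instance is using the older behavior. If one of them has lost the zeros, it is using the newer one.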

There are a couple of ways to achieve the desired outcome on SQL Server 2012. You can either revert to the previous word breakers or, if you have a limited number of terms that you are looking for, consider using a custom dictionary.

Custom dictionaries are described in "Creating Custom Dictionaries for special terms to be indexed 'as-is' in SQL Server 2008 Full-Text Indexes" and "Customize the Behavior of Word Breakers with a Custom Dictionary".

Note: The first article says that the language hex code for English is 1033, but 1033 is the LCID for English. The language hex code for English is 0009, so for an English dictionary the filename should be Custom0009.lex.
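Once the dictionary file is in place, the index has to pick it up. The sketch below shows roughly what that could look like. It assumes a hypothetical table dbo.Documents with a DocumentId key, and that Custom0009.lex has been saved in the instance's FTData folder with one term per line (for example 00123). Check the linked articles for the exact file location and encoding requirements.

```sql
-- Assumption: dbo.Documents and DocumentId are placeholder names;
-- Body is the full-text indexed column from the question.

-- 1. Restart the filter daemon hosts so the word breaker reloads the dictionary.
EXEC sys.sp_fulltext_service @action = 'restart_all_fdhosts';

-- 2. Repopulate the index so existing rows are tokenized again.
ALTER FULLTEXT INDEX ON dbo.Documents START FULL POPULATION;

-- 3. Once population has finished, search for the term exactly as listed in the dictionary.
SELECT DocumentId
FROM dbo.Documents
WHERE CONTAINS(Body, N'"00123"');
```

Because each term has to be listed in the file, this only scales to a limited, known set of reference numbers, as noted above.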
