全文不索引varbinary列(带有html) [英] Full Text not indexing varbinary column (with html)

查看:89
本文介绍了全文不索引varbinary列(带有html)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含HTML数据的表,我想通过html过滤器使用全文索引进行搜索

所以我创建了一个索引:

CREATE FULLTEXT CATALOG myCatalog AS DEFAULT

CREATE FULLTEXT INDEX ON myTable (Body TYPE COLUMN Filetype)
                        KEY INDEX PK_myTable

Body是带有HTML的varbinary(max)列. Filetype列是返回.html的计算列.

没有返回结果.

我确认已安装.html过滤器.如果将列转换为nvarchar并仅创建纯文本"索引(不是html),则FullText索引也已正确安装,并且工作正常.

SQL日志或FTS日志中没有错误.

关键字表为空!

SELECT *
FROM sys.dm_fts_index_keywords
(DB_ID('myDatabase'), OBJECT_ID('myTable'))

它返回的只是文件结束"符号.

显示文档数35",表示已处理文档,但未提取任何关键字.

PS.我有SQL Server Express 版(具有所有高级功能,包括全文).这可能是原因吗?但是,纯文本"全文搜索还是可以的!

PPS.让我的同事在SQL Express 2016上进行测试-相同的结果...在我们的生产服务器"Enterprise"版本上尝试过-相同.

更新

确定,全文索引不支持UNICODE!在varbinary列中.当我将列转换为非unicode时(通过将其转换为nvarchar然后转换为varchar然后又返回到varbinary),它开始工作.

有人知道任何解决方法吗?

解决方案

确定,因此事实证明全文索引确实支持varbinary中的unicode数据,但请注意:

如果您的varbinary列是从Nvarchar创建的,请确保在开头添加0xFFFE unicode签名

例如,我将计算列用于全文索引,因此我不得不将计算列更改为:

alter table myTable
add FTS_Body as 0xFFFE + (CAST(HtmlBody as VARBINARY(MAX)))
--HtmlBody is my nvarchar column that contains html

I have a table with HTML data, that I want to search using the Full Text Index via an html-filter

So I created an index:

CREATE FULLTEXT CATALOG myCatalog AS DEFAULT

CREATE FULLTEXT INDEX ON myTable (Body TYPE COLUMN Filetype)
                        KEY INDEX PK_myTable

Body is a varbinary(max) column with HTML. The Filetype column is a computed column returns .html.

No results are being returned.

I verified that .html filter is installed. FullText index is also installed properly and works fine if I convert the column to nvarchar and create just a "plain text" index (not html).

No errors in the SQL log or FTS log.

The keywords table is just empty!

SELECT *
FROM sys.dm_fts_index_keywords
(DB_ID('myDatabase'), OBJECT_ID('myTable'))

All it returns is "END OF FILE" symbol.

It says "document count 35" which mean the documents were processed, but no keywords were extracted.

PS. I have SQL Server Express Edition 2012 (with all advanced features including full text). Can this be the reason? But again, the "plain" full text search works just fine!

PPS. Asked my coworker to test this on SQL Express 2016 - same result... Tried on our production server "Enterprise" edition - same.

UPDATE

OK it turns out the full text index DOES NOT SUPPORT UNICODE!! in varbinary columns. When I converted the column to non-unicode (by converting it to nvarchar then to varchar and then back to varbinary) It started working.

Anyone knows any workarounds?

解决方案

OK, so it turns out fulltext index DOES support unicode data in varbinary but pay attention to this:

If your varbinary column is created from Nvarchar be sure to include the 0xFFFE unicode signature at the beginning

For example, I'm using a computed column for full text index, so I had to change my computed column to this:

alter table myTable
add FTS_Body as 0xFFFE + (CAST(HtmlBody as VARBINARY(MAX)))
--HtmlBody is my nvarchar column that contains html

这篇关于全文不索引varbinary列(带有html)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆