Full-text index returning no results from PDF FILESTREAM

Question

I have a FILESTREAM table running on SQL Server 2012 on a Windows 8.1 x64 machine. It already has a few PDF and TXT files stored, so I decided to create a full-text index to search through these files, using the following commands:

CREATE FULLTEXT CATALOG FileStreamFTSCatalog AS DEFAULT;

CREATE FULLTEXT INDEX ON storage
(FileName Language 1046, File TYPE COLUMN FileExtension Language 1046)
KEY INDEX PK__storage__3214EC077DADCE3C
ON FileStreamFTSCatalog
WITH CHANGE_TRACKING AUTO;

Then, after reading about other people who had the same problem, I ran these commands:

EXEC sp_fulltext_service @action='load_os_resources', @value=1;
EXEC sp_fulltext_service 'verify_signature', 0;
EXEC sp_fulltext_service 'update_languages';
EXEC sp_fulltext_service 'ft_timeout', 600000;
EXEC sp_fulltext_service 'ism_size', @value=16;
EXEC sp_fulltext_service 'restart_all_fdhosts';
EXEC sp_help_fulltext_system_components 'filter';
reconfigure with override

I can see that the PDF IFilter is registered:

filter  .pdf    E8978DA6-047F-4E3D-9C78-CDBE46041603    C:\Program Files\Adobe\Adobe PDF iFilter 11 for 64-bit platforms\bin\PDFFilter.dll  11.0.1.36   Adobe Systems, Inc.
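
The same registration can also be checked from a catalog view rather than the stored procedure. This is a minimal sketch, assuming the instance exposes `sys.fulltext_document_types` (as SQL Server 2012 does); it only reads metadata and changes nothing:

```sql
-- List the filter registered for the .pdf document type, if any.
-- An empty result would mean no PDF filter is visible to the full-text engine.
SELECT document_type, class_id, path, version, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = '.pdf';
```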

I can even run

select * from storage
where contains(*, 'data')

but it returns only the indexed TXT files, so I'm wondering: is there anything else I need to do to start indexing my PDFs? Or do I have to create another table and reinsert all the PDFs I have already stored, even though the TXT files are being indexed just fine?

Update 1:

When I open SQLFTXXX.LOG, I get this message (for the FileTable):

2014-08-20 06:32:09.48 spid29s     Warning: No appropriate filter was found during full-text index population for table or indexed view '[text_storage].[dbo].[storage_table]' (table or indexed view ID '355584405', database ID '7'), full-text key value '篰磧'. Some columns of the row were not indexed.

And this one (for the FileStream table):

2014-08-19 22:14:50.58 spid20s     Warning: No appropriate filter was found during full-text index population for table or indexed view '[text_storage].[dbo].[storage]' (table or indexed view ID '674101442', database ID '7'), full-text key value '1797'. Some columns of the row were not indexed.
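
The warning says that, for the listed row, the engine could not match the document type to a loaded filter. One thing that may be worth ruling out (an assumption, not something these logs confirm) is that the values stored in the type column do not match a registered document type such as `.pdf`. A minimal sketch using the `storage` table and `FileExtension` column from the index definition above:

```sql
-- Count stored rows per extension value to see exactly what the
-- full-text engine receives as the document type for each file.
SELECT FileExtension, COUNT(*) AS row_count
FROM storage
GROUP BY FileExtension
ORDER BY FileExtension;
```

If the values shown here differ from the extension that `sp_help_fulltext_system_components 'filter'` reports for the PDF filter, that mismatch would be a candidate explanation; if they match, the filter itself is the more likely culprit.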

Recommended answer

I finally found a solution. After trying both the Adobe and Foxit IFilters and getting the same error message with each, I found another IFilter called "PDFlib". I downloaded it, followed its instructions to make it available to SQL Server, and rebuilt the index. Now my PDFs are indexed and can be searched.

I believe the other IFilters will work as well if I follow the same instructions for them. I'll try that once I'm done with my tests and update this answer with the results.
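
The answer does not give the exact commands used to rebuild the index. One possible sequence, as a sketch only, is to reload the operating-system filters, restart the filter daemon hosts, and start a full population of the existing index on `storage`. The choice of `START FULL POPULATION` (rather than dropping and recreating the index) is an assumption:

```sql
-- Reload OS-registered filters and word breakers, then restart the
-- filter daemon hosts so a newly installed IFilter is picked up.
EXEC sp_fulltext_service @action = 'load_os_resources', @value = 1;
EXEC sp_fulltext_service @action = 'restart_all_fdhosts';

-- Repopulate the existing index so rows skipped earlier are filtered again.
ALTER FULLTEXT INDEX ON storage START FULL POPULATION;

-- Check progress: 0 means idle, a nonzero value means a population
-- is still running or paused.
SELECT OBJECTPROPERTYEX(OBJECT_ID('storage'), 'TableFulltextPopulateStatus')
       AS populate_status;
```

Once the status returns to 0, rerunning the `CONTAINS` query from the question should show whether the PDF rows are now included.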
