Pushing documents (blobs) for indexing - Azure Search


Question

I've been working with Azure Search + Azure Blob Storage for a while, and I'm having trouble indexing the incremental changes when new files are uploaded.

How can I refresh the index after uploading a new file into my blob container? These are the steps I follow after uploading a file (I'm using the REST service to perform these actions). I'm using the Microsoft Azure Storage Explorer [link].

Through this app I uploaded my new file to a folder that had already been created. After that, I used the HTTP REST API to issue a 'Run' indexer command, as described in this [link].

The indexer reports that my new file was added successfully, but when I search, the content of this new file is not found.

Does anybody know how to add this new file to the index, and how to find the new file by searching for its content?

I'm following the Microsoft tutorials, but I couldn't find a solution for this issue.

Thanks, guys!

Recommended answer

I'll try to describe how I figured out this issue.

First, I created a data source with this command:

POST https://[service name].search.windows.net/datasources?api-version=[api-version]

https://docs.microsoft.com/en-us/rest/api/searchservice/create-data-source
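As a rough illustration of what the body of that request might look like for a blob container, here is a minimal sketch. The function name, the argument names, and the optional folder filter are assumptions made for this example and are not taken from the original answer; the api-version is left for the caller to supply, and the request is only built, not sent.

```python
import json


def build_datasource_request(service_name, api_version, datasource_name,
                             connection_string, container_name, folder=None):
    """Build the URL and JSON body for a 'create data source' call.

    All argument values are supplied by the caller; nothing here is
    specific to the original poster's service.
    """
    url = (f"https://{service_name}.search.windows.net/datasources"
           f"?api-version={api_version}")

    container = {"name": container_name}
    if folder:
        # Optional: restrict indexing to one virtual folder in the container.
        container["query"] = folder

    body = {
        "name": datasource_name,
        "type": "azureblob",  # data source type for Azure Blob Storage
        "credentials": {"connectionString": connection_string},
        "container": container,
    }
    return url, json.dumps(body)
```

The returned URL and body would then be sent as a POST with the service's admin key in the api-key header and a Content-Type of application/json.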

Second, I created the index:

POST https://[service name].search.windows.net/indexes?api-version=[api-version]

https://docs.microsoft.com/en-us/rest/api/searchservice/create-index
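The original answer does not show its index schema, so the following is only a sketch of a small index definition that would make blob text searchable. The field selection is an assumption for illustration: a key field, the extracted "content" field, and the blob name.

```python
import json


def build_index_definition(index_name):
    """Return a minimal index definition for blob content (illustrative only)."""
    return {
        "name": index_name,
        "fields": [
            # Every index needs exactly one key field of type Edm.String.
            {"name": "id", "type": "Edm.String",
             "key": True, "searchable": False},
            # The blob indexer writes the extracted text into "content";
            # it must be searchable for full-text queries to find it.
            {"name": "content", "type": "Edm.String",
             "searchable": True, "retrievable": True},
            # Blob file name, handy for showing which file matched.
            {"name": "metadata_storage_name", "type": "Edm.String",
             "searchable": True, "retrievable": True},
        ],
    }


def to_request_body(definition):
    """Serialize a definition to the JSON string sent in the POST body."""
    return json.dumps(definition)
```

If the "content" field is missing or not searchable, the indexer can still report success while queries on the file text return nothing.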

Finally, I created the indexer. The problem occurred at this step, because this is where all the configuration is set.

POST https://[service name].search.windows.net/indexers?api-version=[api-version]

https://docs.microsoft.com/en-us/rest/api/searchservice/create-indexer
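A minimal sketch of an indexer definition that ties a data source to an index might look like the following. The names, the field mapping onto an "id" key field, and the optional schedule are assumptions for illustration; the answer does not show the actual definition that was used.

```python
import json


def build_indexer_definition(indexer_name, datasource_name, index_name,
                             schedule_interval=None):
    """Return an indexer definition linking a data source to a target index."""
    definition = {
        "name": indexer_name,
        "dataSourceName": datasource_name,
        "targetIndexName": index_name,
        "fieldMappings": [
            {
                # Blob paths contain characters that are not valid in a
                # document key, so the path is base64-encoded into "id"
                # (assumes the index's key field is called "id").
                "sourceFieldName": "metadata_storage_path",
                "targetFieldName": "id",
                "mappingFunction": {"name": "base64Encode"},
            }
        ],
    }
    if schedule_interval:
        # ISO 8601 duration, e.g. "PT2H" to run every two hours.
        definition["schedule"] = {"interval": schedule_interval}
    return definition


def to_request_body(definition):
    """Serialize the definition for the POST body."""
    return json.dumps(definition)
```

Without a schedule, the indexer runs once when it is created and afterwards only when a Run command is issued.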

Once all of this is done, the indexer starts indexing all content automatically (as long as there is content in the blob storage).
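To check whether a newly uploaded blob actually reached the index, one option is to trigger the indexer, read its status, and then query the index for text from the file. The sketch below only builds the three URLs involved; the helper names are made up for this example, and the service, indexer, and index names and the api-version are placeholders for the caller to fill in.

```python
from urllib.parse import quote, urlencode


def _base(service_name):
    return f"https://{service_name}.search.windows.net"


def indexer_run_url(service_name, indexer_name, api_version):
    """URL for POST: asks the indexer to run now."""
    return (f"{_base(service_name)}/indexers/{quote(indexer_name)}/run"
            f"?api-version={api_version}")


def indexer_status_url(service_name, indexer_name, api_version):
    """URL for GET: returns the last run result, including per-item errors."""
    return (f"{_base(service_name)}/indexers/{quote(indexer_name)}/status"
            f"?api-version={api_version}")


def search_url(service_name, index_name, api_version, text):
    """URL for GET: full-text query against the index."""
    query = urlencode({"api-version": api_version, "search": text})
    return f"{_base(service_name)}/indexes/{quote(index_name)}/docs?{query}"
```

The status response is worth reading even when the run is reported as successful, since warnings for individual documents are listed there.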

Now comes the crucial part. While the indexer is trying to extract all the 'text' from your files, problems can occur when a file's type is not 'indexable'. For example, there are two properties that you must pay attention to: the excluded extensions and the indexed extensions.
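In the blob indexer configuration these two properties are named indexedFileNameExtensions and excludedFileNameExtensions, and each takes a comma-separated list of extensions with a leading dot. The helper below is a sketch, not part of the original answer; the normalization it applies (trimming, lowercasing, adding the dot, dropping duplicates) is an assumption about what "writing the types properly" means in practice.

```python
def normalize_extensions(extensions):
    """Turn ['pdf', ' .DOCX '] into '.pdf,.docx'."""
    cleaned = []
    for ext in extensions:
        ext = ext.strip().lower()
        if not ext:
            continue  # skip empty entries
        if not ext.startswith("."):
            ext = "." + ext  # the service expects a leading dot
        if ext not in cleaned:
            cleaned.append(ext)  # keep first occurrence, preserve order
    return ",".join(cleaned)


def build_extension_config(indexed=(), excluded=()):
    """Return the 'parameters' fragment of an indexer definition."""
    configuration = {}
    if indexed:
        configuration["indexedFileNameExtensions"] = normalize_extensions(indexed)
    if excluded:
        configuration["excludedFileNameExtensions"] = normalize_extensions(excluded)
    return {"parameters": {"configuration": configuration}}
```

For example, build_extension_config(["pdf", "docx"], ["png"]) yields a fragment that indexes only .pdf and .docx files and skips .png files.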

If you don't write the extensions properly, the indexer throws an exception. The feedback message (which in my opinion is not good; it was misleading) then says that to avoid this error you should set the indexer to '"dataToExtract" : "storageMetadata"'.

This setting means that only the metadata is indexed and no longer the content of your files, so you cannot search by that content or retrieve it.

Further down, the same message says that to avoid these issues you should set two properties (this is what solved the problem):

"failOnUnprocessableDocument" : false,"failOnUnsupportedContentType" : false
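These two flags belong under parameters.configuration in the indexer definition. The sketch below adds them to an existing definition while keeping dataToExtract at contentAndMetadata, so file content stays searchable. The function name is hypothetical, and the surrounding definition is whatever the caller already has.

```python
import copy
import json


def with_tolerant_parsing(indexer_definition):
    """Return a copy of the definition that skips unsupported or
    unprocessable blobs instead of failing, and still extracts content."""
    updated = copy.deepcopy(indexer_definition)  # leave the input untouched
    parameters = updated.setdefault("parameters", {})
    configuration = parameters.setdefault("configuration", {})

    # Extract both file content and metadata (not "storageMetadata",
    # which would drop the content from the index).
    configuration["dataToExtract"] = "contentAndMetadata"
    # Skip blobs with an unsupported content type instead of failing.
    configuration["failOnUnsupportedContentType"] = False
    # Skip blobs that cannot be parsed instead of failing.
    configuration["failOnUnprocessableDocument"] = False
    return updated


def to_request_body(definition):
    """Serialize for a PUT to /indexers/[indexer name]?api-version=[api-version]."""
    return json.dumps(definition)
```

Any configuration already present in the definition, such as the extension filters, is kept; only the three keys above are set.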

With that, everything now works properly. I appreciate your help, @Eugene Shvets, and I hope this is useful for someone else.
