Solr, block updating of existing document


Question




When a document is sent to Solr and a document with the same ID already exists in the index, the new one replaces the old one.

But I don't want documents to be replaced automatically; I want Solr to ignore the duplicate and proceed to the next document. How can I configure Solr to do this?

Of course, I could query Solr to check whether it already has the document, but that is bad for me: I do bulk updates, and this would complicate the process and increase the number of requests.

So is there any way to configure Solr to ignore duplicates?

Solution

You can disable the automatic overwriting of documents that share the same uniqueKey by specifying the attribute overwrite="false" on the add element when you send documents to the UpdateHandler.

<add overwrite="false">
    <doc>
        <field name="id">id</field>
    </doc>
</add>

Note, however, that this allows duplicate documents to exist in Solr; it does not skip new documents that have the same ID as existing ones. I don't think this is the behaviour you want.

I think you should write your own UpdateHandler or UpdateRequestProcessor, or follow the suggestions you got on the Solr user mailing list.
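
For readers on newer Solr releases, writing a custom processor may not be necessary: as far as I know, Solr 6.4 and later ship a SkipExistingDocumentsProcessorFactory that skips inserts whose uniqueKey already exists in the index. The fragment below is only a sketch of how such a chain might look in solrconfig.xml. The chain name "skipexisting" is an arbitrary choice, and the processor class and its options should be checked against the documentation for your Solr version before relying on them.

```xml
<!-- Sketch only: assumes a Solr version that includes
     SkipExistingDocumentsProcessorFactory (believed to be 6.4+). -->
<updateRequestProcessorChain name="skipexisting">
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.SkipExistingDocumentsProcessorFactory">
    <!-- Silently drop a new document if its uniqueKey is already indexed -->
    <bool name="skipInsertIfExists">true</bool>
    <!-- Leave atomic updates of missing documents to their default handling -->
    <bool name="skipUpdateIfMissing">false</bool>
  </processor>
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

A bulk update would then select the chain per request, for example by adding update.chain=skipexisting to the update URL. The existence check happens on the Solr side, so the client does not need to issue a separate query for each document.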
