How to add only new docs or changed docs on Elasticsearch?

Problem description

Scenario: A script pulls data from an external API, formats the results as a dictionary/JSON object, and pushes the data to Elasticsearch. The script is scheduled to run periodically.

Conditions: The script should only push the dictionaries for records that do not already exist in Elasticsearch. For records that already exist in Elasticsearch, it should update the fields if any data has changed.

My approach: The records from the API have an ID, which I use to check whether they exist in Elasticsearch by running a search query. I make a list of the IDs that do not exist in Elasticsearch and push the corresponding records.

Issue: Suppose a record {'ID':1, 'Status':'Started'} was pushed to Elasticsearch yesterday. If the data has since changed to {'ID':1, 'Status':'Completed'}, it will still be ignored, because I am checking only the ID.

Solution that I am thinking of: Compare all the fields of the JSON object/dictionary before inserting into Elasticsearch. If everything matches, skip the insertion. If any field has a different value, insert the record. [Having multiple docs for the same record is not an issue. What needs to be avoided is having multiple docs for the same record with all the same values.]
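One way to express "skip identical copies but keep changed versions" is to derive the document ID from the record's content. The sketch below is an assumption for illustration, not something stated in the original post: `record_fingerprint` is a hypothetical helper that hashes a canonical JSON serialization, so two records with the same fields and values always map to the same ID.

```python
import hashlib
import json


def record_fingerprint(record: dict) -> str:
    """Return a stable SHA-256 hex digest of a record's contents.

    Hypothetical helper: keys are sorted so that field order does not
    affect the digest. The record must be JSON-serializable.
    """
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


started = {"ID": 1, "Status": "Started"}
completed = {"ID": 1, "Status": "Completed"}

# Same content in a different key order yields the same fingerprint.
assert record_fingerprint(started) == record_fingerprint({"Status": "Started", "ID": 1})
# A changed field yields a different fingerprint.
assert record_fingerprint(started) != record_fingerprint(completed)
```

If such a fingerprint is used as the document ID and the document is indexed with `op_type="create"`, Elasticsearch rejects a second copy with identical content (a version conflict), while a changed record gets a new ID and is stored as a separate document.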

Recommended answer

You can pass the document ID to the index method. This inserts the record if it doesn't exist; otherwise it overwrites the existing document with that ID, so any fields that differ are updated. This way you don't need to add custom logic to manage that ID as a regular field.
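A minimal sketch of this, assuming the official `elasticsearch` Python client in a recent version where `index` accepts `document=` (older versions take `body=` instead). The function name `push_record`, the index name `jobs`, and the cluster URL are hypothetical. The `ID` field comes from the example in the question.

```python
def push_record(es, index_name: str, record: dict) -> str:
    """Index `record` under its own ID and return the reported result.

    `es` is expected to be an `elasticsearch.Elasticsearch` client.
    Indexing with an explicit ID creates the document if that ID is new,
    and otherwise replaces the stored document with the new body.
    """
    response = es.index(index=index_name, id=str(record["ID"]), document=record)
    # The index API reports "created" for a new ID and "updated" otherwise.
    return response["result"]


# Usage (requires a reachable cluster; the URL is an assumption):
# from elasticsearch import Elasticsearch
# es = Elasticsearch("http://localhost:9200")
# push_record(es, "jobs", {"ID": 1, "Status": "Started"})
# push_record(es, "jobs", {"ID": 1, "Status": "Completed"})
```

Indexing an unchanged record this way still rewrites the document and increments its version, but it does not create a duplicate.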
