Delete old documents from Elasticsearch using Logstash


Question

I am using Logstash to index data from Postgres (jdbc input plugin) into Elasticsearch. I don't have any time-based information in the database. The Postgres table to import, users, has 2 columns: userid (unique) and uname. On the Elasticsearch side, _id = userid. I am exporting this data every hour using a cron schedule in Logstash.

input {
     jdbc {
         # jdbc connection settings (jdbc_connection_string, jdbc_user, jdbc_driver_class, ...) omitted here
         schedule  => "0 */1 * * *"
         statement => "SELECT userid, uname FROM users"
     }
}
output {
     elasticsearch {
        hosts       => ["elastic_search_host"]
        index       => "user_data"
        document_id => "%{userid}"
    }
}

This Logstash config indexes data correctly, but it only handles the insert and update cases. If any data/user info is deleted from the table, the corresponding document is not deleted from the Elasticsearch index. Can someone please help me with the delete case?

Answer

There is no out-of-the-box option available in Logstash to achieve your intended outcome.

https://discuss.elastic.co/t/delete-elasticsearch-document-with-logstash-jdbc-input/47490 - as mentioned here, you can add a "status" column and flag the entry as deleted, instead of deleting the entry.
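
A minimal sketch of that soft-delete approach, assuming a boolean deleted column is added to the users table (the column name and the conditional logic are illustrative, not part of the original answer):

    input {
        jdbc {
            # jdbc connection settings omitted as in the question
            schedule  => "0 */1 * * *"
            # Also select the assumed soft-delete flag
            statement => "SELECT userid, uname, deleted FROM users"
        }
    }
    filter {
        # Route soft-deleted rows to a delete action, everything else to a normal index
        if [deleted] == true {
            mutate { add_field => { "[@metadata][action]" => "delete" } }
        } else {
            mutate { add_field => { "[@metadata][action]" => "index" } }
        }
    }
    output {
        elasticsearch {
            hosts       => ["elastic_search_host"]
            index       => "user_data"
            document_id => "%{userid}"
            # Per-event action taken from metadata: "index" or "delete"
            action      => "%{[@metadata][action]}"
        }
    }

Note that the SQL statement must still return the soft-deleted rows; if they were physically removed from the table, Logstash would never see them and could not issue the delete.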

Another way to do it would be to delete your index every hour and then let Logstash do its thing. There will be a very brief window during which there is no data in Elasticsearch.

To avoid that, you can instead configure Logstash to index into a new index every hour, e.g. user_data-timestamp, and then delete the older indices externally using Curator etc.
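
A sketch of that timestamped-index variant (the hourly index name pattern below is an assumption, not from the original answer):

    output {
        elasticsearch {
            hosts       => ["elastic_search_host"]
            # One index per hourly run, e.g. user_data-2021.06.01.13
            index       => "user_data-%{+YYYY.MM.dd.HH}"
            document_id => "%{userid}"
        }
    }

Queries would then go through a wildcard pattern or alias such as user_data-*, while Curator (or a similar external job) drops indices older than the most recent run.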

