How to absolutely delete something from ElasticSearch?

Question

We use an ELK stack for our logging. I've been asked to design a process for removing sensitive information that was logged accidentally.

Based on my reading about how ElasticSearch (Lucene) handles deletes and updates, the data is still in the index, just not available. It will ultimately get cleaned up as index segments get merged, etc.

Is there a process to run an update (to redact something) or a delete (to remove something) and guarantee its removal?

Recommended answer

When you update or delete a value, ES marks the current document as deleted and, for an update, indexes the new document. The deleted value is still present in the index files, but a search will never return it. Granted, anyone who gets access to the underlying index files might be able to use a tool (Luke or similar) to inspect them and potentially see the deleted sensitive data.
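To get a rough idea of how many deleted documents are still sitting in the segments, the index stats API (GET /myindex/_stats/docs) reports a docs.deleted counter next to docs.count. The following is a minimal Python sketch of reading those two numbers. The sample response and its values are made up for illustration, and the helper name deleted_ratio is hypothetical. Treat the index-wide ratio as a rough signal only, since merging works on individual segments.

```python
import json


def deleted_ratio(stats: dict) -> float:
    """Share of deleted-but-not-yet-merged docs among all docs in the primaries."""
    docs = stats["_all"]["primaries"]["docs"]
    live, deleted = docs["count"], docs["deleted"]
    total = live + deleted
    # Guard against an empty index to avoid dividing by zero.
    return deleted / total if total else 0.0


# Hypothetical, trimmed response body of GET /myindex/_stats/docs
sample = json.loads(
    '{"_all": {"primaries": {"docs": {"count": 950, "deleted": 50}}}}'
)

ratio = deleted_ratio(sample)
print(f"{ratio:.1%} of the documents on disk are marked as deleted")
```

A non-zero docs.deleted after a delete or update means the old document versions are still physically present in some segment.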

The only way to guarantee that documents marked as deleted are really removed from the index segments is to force a merge of the existing segments:

POST /myindex/_forcemerge?only_expunge_deletes=true

Be aware, though, that a setting called index.merge.policy.expunge_deletes_allowed defines a threshold below which the force merge doesn't happen. By default this threshold is 10%, so if you have less than 10% deleted documents, the force merge call won't do anything. You might need to lower the threshold for the deletion to happen. Or, maybe easier, make sure not to index sensitive information in the first place.
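The two steps (lower the threshold, then force the merge) could be scripted roughly as follows. This is only a sketch using Python's standard library: the cluster address http://localhost:9200, the index name myindex, and the helper names are assumptions, and setting the threshold to 0 is one possible choice rather than a recommendation. The requests are only built here, not sent, so the snippet runs without a cluster.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured cluster
INDEX = "myindex"                 # assumption: index name from the example above


def lower_threshold_request(index: str, pct: float = 0) -> Request:
    """Build PUT /<index>/_settings that lowers expunge_deletes_allowed."""
    body = json.dumps(
        {"index.merge.policy.expunge_deletes_allowed": pct}
    ).encode("utf-8")
    return Request(
        f"{ES_URL}/{index}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def expunge_request(index: str) -> Request:
    """Build POST /<index>/_forcemerge?only_expunge_deletes=true."""
    return Request(
        f"{ES_URL}/{index}/_forcemerge?only_expunge_deletes=true",
        method="POST",
    )


# Order matters: the setting must be lowered before the merge is requested.
steps = [lower_threshold_request(INDEX), expunge_request(INDEX)]
for req in steps:
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

After the merge completes, you may want to restore the previous threshold, since a low value makes later expunge calls rewrite more segments.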
