Optimize API for reducing segments and eliminating ES deleted docs not working


Question

This is a continuation of my previous question, "Does huge number of deleted doc count affects ES query performance", about deleted docs in my ES index.

As pointed out in the answer there, I used the optimize API, because I am on ES 1.x, where the force merge API is not available. After reading about the optimize API in a GitHub link (provided earlier, as I couldn't find it on the ES site) by Shay Banon, founder of Elastic, it looks like it does the same work.

I got a success message for my index after running the optimize API, but I don't see the total count of deleted docs decreasing. I am worried because when I checked the segments of my index using the segments API, I saw more than 25 segments for each shard, with every shard holding 250-1 gb of data in memory and almost 500k docs, while some shards have only a few deleted docs.

So my questions are:

  1. My index has multiple shards across multiple data nodes. When I ran the optimize API using only one node's URL, did it merge only the segments on that node?
  2. The segments API result shows a node ID like "node": "f2hsqeamadnaskda", while I am using the KOPF plugin and have custom names for my data nodes. How can I relate this cryptic node ID to my human-readable node name, to check whether statement 1 is correct?
  3. As there is no documentation available on the optimize API, is it possible to merge segments on all shards across all nodes in a single shot? And do I need to make the index read-only before applying it?

Recommended answer

@Nirmal has answered your first two questions, so:

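  3. As there is no documentation available on the optimize API, is it possible to merge segments on all shards across all nodes in a single shot? And do I need to make the index read-only before applying it?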

There is documentation available for 1.x: https://www.elastic.co/guide/en/elasticsearch/reference/1.7/indices-optimize.html. You are probably looking for calls like these:

  • GET _cat/segments/<index_pattern>: Lists all segments in all the shards (can be thousands). Also lists deleted docs.
  • POST <index_pattern>/_optimize?max_num_segments=1: Force-merges all segments down to a single segment per shard. Do this when the index is no longer being written to. It helps reduce CPU/RAM load on the data nodes.
  • POST <index_pattern>/_optimize?only_expunge_deletes=true: Only removes deleted docs.
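As a rough illustration of the two `_optimize` calls above, here is a minimal Python sketch that builds the request URLs and, optionally, sends them. The helper names `optimize_url` and `run_optimize`, the `http://localhost:9200` address, and the `my_index` index name are assumptions made for this example, not part of the original answer. It uses only the standard library.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen


def optimize_url(base_url, index_pattern, max_num_segments=None,
                 only_expunge_deletes=False):
    """Build an ES 1.x _optimize URL for an index pattern (hypothetical helper)."""
    params = {}
    if max_num_segments is not None:
        params["max_num_segments"] = max_num_segments
    if only_expunge_deletes:
        params["only_expunge_deletes"] = "true"
    # Keep '*' and ',' literal so wildcard and multi-index patterns still work.
    path = quote(index_pattern, safe="*,")
    url = f"{base_url.rstrip('/')}/{path}/_optimize"
    return f"{url}?{urlencode(params)}" if params else url


def run_optimize(url, timeout=3600):
    """POST to the given URL and return the raw response body.

    Requires a reachable cluster; a merge can take a long time,
    hence the generous (assumed) timeout.
    """
    with urlopen(Request(url, method="POST"), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `optimize_url("http://localhost:9200", "my_index", max_num_segments=1)` yields the URL for the force-merge call, and passing `only_expunge_deletes=True` instead yields the expunge-only variant. Calling `run_optimize` on either URL sends the request.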

Finally, you can use * as <index_pattern> to run the call against all indices in the whole cluster.
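To check whether a merge actually reduced the segment and deleted-doc counts, it may help to total the cat segments output per shard before and after the call. The sketch below assumes the output was requested with `h=index,shard,prirep,segment,docs.count,docs.deleted`, so each line has exactly those six whitespace-separated columns. The function name `summarize_segments` and the sample rows are made up for illustration.

```python
from collections import defaultdict

# Column order assumed to match the h= parameter described above.
COLUMNS = ("index", "shard", "prirep", "segment", "docs.count", "docs.deleted")


def summarize_segments(cat_text):
    """Total segments, live docs, and deleted docs per (index, shard, prirep)."""
    summary = defaultdict(lambda: {"segments": 0, "docs": 0, "deleted": 0})
    for line in cat_text.splitlines():
        fields = line.split()
        # Skip blank lines, a header row (when ?v is used), or unexpected shapes.
        if len(fields) != len(COLUMNS) or not fields[4].isdigit():
            continue
        row = dict(zip(COLUMNS, fields))
        key = (row["index"], int(row["shard"]), row["prirep"])
        summary[key]["segments"] += 1
        summary[key]["docs"] += int(row["docs.count"])
        summary[key]["deleted"] += int(row["docs.deleted"])
    return dict(summary)


# Hypothetical sample rows, not real cluster output.
SAMPLE = """\
index shard prirep segment docs.count docs.deleted
logs 0 p _a 1000 50
logs 0 p _b 2000 0
logs 1 p _c 500 25
"""

for shard_key, totals in sorted(summarize_segments(SAMPLE).items()):
    print(shard_key, totals)
```

Comparing the summaries taken before and after the optimize call shows, shard by shard, whether the segment and deleted-doc totals went down.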
