How to speed up Elasticsearch recovery?

Question

I'm working on an ES cluster of 6B small documents, organized into 6.5K indexes, for a total of 6TB. The indexes are replicated and sharded across 7 servers. Index size varies from a few KB to hundreds of GB.

Before using ES, I used Lucene with the same document organization.

Recovery of the Lucene-based application was almost immediate: indexes were lazily loaded when a query arrived, and the IndexReaders were then cached to speed up future replies.

Now, with Elasticsearch, recovery is very slow (tens of minutes). Note that before a crash all the indexes are usually open, and most of them receive documents to index quite often.

Is there any good pattern to reduce the ES recovery time? I'm also interested in anything related to index management, not only configuration. For example, I would like to recover the most important indexes first and then load all the others; that way I can reduce the perceived downtime for most users.

I'm using the following configuration:

#Max number of indices concurrently loaded at startup
indices.recovery.concurrent_streams: 80

#Max number of bytes concurrently read at startup for loading the indices
indices.recovery.max_bytes_per_sec: 250mb

#Controls specifically the number of initial recoveries of primaries that are allowed per node
cluster.routing.allocation.node_initial_primaries_recoveries: 20

#Max number of shard recoveries allowed to run concurrently on a node
cluster.routing.allocation.node_concurrent_recoveries: 80

#The number of streams to open (on a node level) for small files (under 5mb) to recover a shard from a peer shard
indices.recovery.concurrent_small_file_streams: 30

PS: Right now I'm using ES 2.4.1, but I will move to ES 5.2 in a few weeks. PPS: One scenario is recovery after a power outage.

Thanks!

Solution

Edit: To prioritize recovery of certain indices, you can use the priority setting on the index this way:

PUT some_index
{
  "settings": {
    "index.priority": 10
  }
}

The index with the highest priority will be recovered first; otherwise, recovery is ordered by the creation time of the index (see this).
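
Since index.priority is a dynamic setting, it can also be applied to indexes that already exist (as with the 6.5K indexes in the question) without recreating them; a minimal sketch, where important_index and less_important_index are hypothetical names:

PUT important_index/_settings
{
  "index.priority": 10
}

PUT less_important_index/_settings
{
  "index.priority": 1
}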

Second Edit: To change the number of replicas, you simply need an HTTP request:

PUT index_name/_settings
{
  "index":{
    "number_of_replicas" : "0"
  }
}

Regarding snapshot recovery, I would suggest the following points (some might not be applicable to your case):

  • Set the number of replicas to 0 before the recovery, then swap it back to its previous value afterward (less writing); a sketch of how to do this follows this list
  • If using spinning disks, you can add index.merge.scheduler.max_thread_count: 1 to elasticsearch.yml to increase indexing speed (see here)
  • Before the recovery, update your index settings with "refresh_interval" : "-1" and put it back to its default value afterward (see the doc)
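
The first and third points can be combined into a single settings update; a minimal sketch using the same _settings API shown above (the index name your_index, the previous replica count of 1, and the default 1s refresh interval are assumptions, restore whatever values your indexes actually use):

PUT your_index/_settings
{
  "index": {
    "number_of_replicas": 0,
    "refresh_interval": "-1"
  }
}

Then, once the recovery or bulk indexing is done, put them back:

PUT your_index/_settings
{
  "index": {
    "number_of_replicas": 1,
    "refresh_interval": "1s"
  }
}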

If you don't care about searching yet, the following could also help on your ES5 cluster:

PUT /_cluster/settings
{
    "transient" : {
        "indices.store.throttle.type" : "none" 
    }
}

A few articles below that could help:

A few general tips: be sure you have swapping disabled. How much memory is allocated to your nodes in the ES cluster? (You should give the JVM heap half of a node's total available memory, with a cap around 32 GB due to the JVM's memory addressing limit.)
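
A minimal sketch of those two tips for the ES 2.4 cluster mentioned above (the 31g value is an assumption, use roughly half of the node's RAM but stay below 32 GB; on ES 5.x the equivalents are bootstrap.memory_lock: true in elasticsearch.yml and the -Xms/-Xmx values in jvm.options):

# elasticsearch.yml: lock the JVM memory so the OS cannot swap it out
bootstrap.mlockall: true

# environment for the Elasticsearch process (e.g. /etc/default/elasticsearch): heap size
ES_HEAP_SIZE=31g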
