How to speed up Elasticsearch recovery?
Question
I'm working on an ES cluster holding 6 billion small documents, organized in 6.5K indexes, for a total of 6 TB. The indexes are replicated and sharded across 7 servers. Index sizes vary from a few KB to hundreds of GB.
Before using ES, I used Lucene with the same document organization.
Recovery of the Lucene-based application was almost immediate: the indexes were lazily loaded when a query arrived, and the IndexReader instances were then cached to speed up later replies.
Now, with Elasticsearch, recovery is very slow (tens of minutes). Note that before a crash all the indexes are usually open, and most of them receive documents to index quite often.
Is there a good pattern for reducing the ES recovery time? I'm interested in anything related to index management, not only configuration. For example, I would like to recover the most important indexes first and then load all the others; that way I can reduce the perceived downtime for most users.
I'm using the following configuration:
# Max number of indices concurrently loaded at startup
indices.recovery.concurrent_streams: 80
# Max number of bytes concurrently read at startup for loading the indices
indices.recovery.max_bytes_per_sec: 250mb
# Number of initial primary recoveries allowed per node
cluster.routing.allocation.node_initial_primaries_recoveries: 20
# Max number of indices concurrently loaded at startup
cluster.routing.allocation.node_concurrent_recoveries: 80
# Number of streams to open (at node level) for small files (under 5mb) when recovering a shard from a peer shard
indices.recovery.concurrent_small_file_streams: 30
PS: Right now I'm using ES 2.4.1, but I will move to ES 5.2 in a few weeks. PPS: One scenario could be a recovery after a blackout.
Thanks!
Recommended answer
Edit: To prioritize recovery of certain indices, you can use the priority setting on the index like this:
PUT some_index
{
"settings": {
"index.priority": 10
}
}
The index with the highest priority is recovered first; otherwise recovery is ordered by the creation time of the index (see the Elasticsearch documentation on index recovery prioritization).
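To apply priorities to many indexes at once, the settings updates can be generated with a small script. The following is a minimal sketch, not a definitive implementation: the index names and priority values are hypothetical, and it only builds the requests (method, path, JSON body) rather than sending them, so any HTTP client can be used to execute them against the cluster.

```python
import json


def priority_requests(priorities):
    """Build (method, path, body) tuples that set index.priority per index.

    `priorities` maps an index name to an integer priority.
    """
    requests = []
    # Emit the highest priority first purely for readability;
    # the order in which the settings are applied does not matter.
    for index, priority in sorted(priorities.items(), key=lambda kv: -kv[1]):
        body = json.dumps({"index": {"priority": priority}})
        requests.append(("PUT", f"/{index}/_settings", body))
    return requests


if __name__ == "__main__":
    # Hypothetical index names and priority values, for illustration only.
    example = {"customers": 10, "orders": 5, "logs-archive": 1}
    for method, path, body in priority_requests(example):
        print(method, path, body)
```

Giving the few indexes that most users depend on a high priority, and leaving the long tail at the default, matches the goal stated in the question of reducing perceived downtime.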
Second edit: To change the number of replicas, you simply need one HTTP request:
PUT index_name/_settings
{
"index":{
"number_of_replicas" : "0"
}
}
Regarding snapshot recovery, I would suggest the following points (some might not apply to your case):
- Set the number of replicas to 0 before the recovery, then set it back to its default value afterwards (less writing).
- If you use spinning disks, you can add the following to elasticsearch.yml to increase indexing speed (see the Elasticsearch merge settings documentation):
index.merge.scheduler.max_thread_count: 1
- Before the recovery, update your index settings with:
"refresh_interval" : "-1"
and set it back to its default value afterwards (see the Elasticsearch documentation).
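The "change before, restore after" steps in the list above can be sketched as a pair of settings requests per index. This is only an illustration under stated assumptions: the index name is hypothetical, the values restored afterwards (1 replica, a refresh interval of 1s) are assumed to be the values in use before the restore, and the function only builds the requests instead of sending them.

```python
import json


def restore_settings_requests(index, replicas=1, refresh_interval="1s"):
    """Return the settings request to send before a restore and the one to send after it.

    `replicas` and `refresh_interval` are the values to put back afterwards;
    the defaults here are assumptions and should match what the index used before.
    """
    path = f"/{index}/_settings"
    # Before the restore: no replicas and no periodic refresh, to reduce writes.
    before = {"index": {"number_of_replicas": 0, "refresh_interval": "-1"}}
    # After the restore: put the previous values back.
    after = {
        "index": {
            "number_of_replicas": replicas,
            "refresh_interval": refresh_interval,
        }
    }
    return [("PUT", path, json.dumps(before)), ("PUT", path, json.dumps(after))]


if __name__ == "__main__":
    # "some_index" is a placeholder name.
    for method, path, body in restore_settings_requests("some_index"):
        print(method, path, body)
```

Send the first request before starting the restore and the second only once the restore has finished; replicas are then rebuilt from the recovered primaries.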
If you don't care about searching yet, the following setting on your ES5 cluster could also help:
PUT /_cluster/settings
{
"transient" : {
"indices.store.throttle.type" : "none"
}
}
A few articles that could help:
- https://www.elastic.co/guide/en/elasticsearch/reference/5.x/tune-for-indexing-speed.html
- https://www.elastic.co/guide/en/elasticsearch/reference/5.x/tune-for-disk-usage.html
A few general tips: make sure swapping is disabled. How much memory is allocated to the nodes in your ES cluster? (You should give the heap half of a node's total available memory, capped at 32 GB, because above roughly that size the JVM can no longer use compressed object pointers.)
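The heap sizing rule above is simple arithmetic. As a minimal sketch (the function name is hypothetical, and the 32 GB cap is taken from the rule of thumb in this answer rather than measured on a specific JVM):

```python
def recommended_heap_gb(total_ram_gb, cap_gb=32):
    """Half of the node's RAM, capped at `cap_gb` (rule of thumb from the answer)."""
    return min(total_ram_gb / 2, cap_gb)


if __name__ == "__main__":
    # Example node sizes in GB, for illustration only.
    for ram in (16, 64, 128):
        print(f"{ram} GB RAM -> {recommended_heap_gb(ram)} GB heap")
```

The other half of the memory is left to the operating system, whose file system cache Lucene relies on heavily.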