Write-heavy Elasticsearch

Question
I am writing a real-time analytics tool using Kafka, Storm, and Elasticsearch, and I want an Elasticsearch cluster that is write-optimized for about 50K inserts/sec. For the purpose of a POC I tried bulk-inserting documents into Elasticsearch, attaining 10K inserts per second.
I am running ES on a large Amazon EC2 instance. I have tweaked the properties as below:
indices.memory.index_buffer_size: 30%
indices.memory.min_shard_index_buffer_size: 30mb
indices.memory.min_index_buffer_size: 96mb
threadpool.bulk.type: fixed
threadpool.bulk.size: 100
threadpool.bulk.queue_size: 2000
bootstrap.mlockall: true
But I want write performance on the order of 50K/sec, not 10K/sec, to ensure the normal flow of my Storm topology. Can anyone suggest how to configure a heavily write-optimized ES cluster?
The scripts located here may help you improve indexing performance. There are many options and configurations to try; I write about some here, but this isn't a comprehensive list. Reducing replicas and increasing shards improves indexing performance, but it reduces availability and search performance during indexing.
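The replica/shard trade-off above can be sketched as index-level settings sent to the REST API when the index is created. This is a minimal sketch, not a tuned recommendation: the shard count, replica count, and refresh interval below are hypothetical values you would benchmark for your own workload.

```python
import json

def write_optimized_settings(shards=6, replicas=0, refresh="30s"):
    """Build index settings that favor indexing throughput over search
    freshness and redundancy. All numbers here are illustrative."""
    return {
        "settings": {
            "number_of_shards": shards,      # more shards -> more parallel indexing
            "number_of_replicas": replicas,  # fewer replicas -> less write amplification
            "refresh_interval": refresh,     # refresh less often during heavy loading
        }
    }

# Would be sent as the JSON body of: PUT http://<node>:9200/<index>
body = json.dumps(write_optimized_settings())
print(body)
```

Replicas can be added back (and the refresh interval lowered) after the heavy load finishes, via the index settings update API.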
Perhaps sending HTTP bulk requests to several nodes rather than just the master node could help you get the figures you desire.
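Spreading bulk requests across nodes can be sketched as a simple round-robin over node URLs, with each request body in the newline-delimited `_bulk` format. The node addresses and index name are assumptions; a real client would also handle retries and responses:

```python
import itertools
import json

# Hypothetical data-node addresses; round-robin spreads coordination work.
NODES = ["http://10.0.0.1:9200", "http://10.0.0.2:9200", "http://10.0.0.3:9200"]
node_cycle = itertools.cycle(NODES)

def bulk_body(docs, index):
    """Serialize docs into the _bulk NDJSON format: one action line
    followed by one source line per document, newline-terminated."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"

def next_bulk_target():
    """Pick the next node so requests don't all hit one coordinator."""
    return next(node_cycle) + "/_bulk"
```

Each `bulk_body(...)` result would then be POSTed to `next_bulk_target()` with the `Content-Type: application/x-ndjson` header.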
Hope this helps somewhat. 10K inserts/sec is better than what most people have achieved, though whether they were using a large Amazon instance I don't know.