When do we need a large heap with Elasticsearch?

Views: 320 | Published: 2017/8/7 4:41:02 | Tag: elasticsearch

Question

Running ES 1.5.2
Java 1.8_45
Windows 2008
4 nodes, each with 32 cores, 128 GB RAM, and 5 TB of SSD storage.

My goal is to index about 2.5 billion documents. I am up to 810 million so far, at about 30k on average per document.

I currently have ES_HEAP_SIZE=30g

But I have been experiencing a lot of memory pressure and STW (stop-the-world) pauses. Example: currently one node is always above 90% heap usage, while the rest are coasting anywhere between 30% and 40%. So it seems that one node won't GC?

Only two things are happening on the cluster: bulk indexing (no errors logged) and some scroll searches.

I am using doc values where I can. Currently there is no field data cache (except for Marvel's, which is very small), and the filter cache is minimal, about 100 MB per node.

The nodes are still trying to recover, so I just don't want to stop the cluster fully and reset the RAM to 10 GB?

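How I connect to the cluster for both bulk indexing and scroll searches:

import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// Do this once at application startup and re-use the client instance.
Settings settings = ImmutableSettings
    .settingsBuilder()
    .put("cluster.name", "xxxx")
    .build();

// Register the transport address (port 9300) of each of the four nodes.
client = new TransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300));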

Answer

Don't send the bulk requests to only one node. The same goes for the search requests.

The bulk request is kept in a memory buffer on the node that receives it, so sending every request of any kind to a single node is not a good idea. Round-robin the requests, either by using a proxy server (if you have one) or by using a client node and sending the requests to that node. The client node knows how to do the round-robin mechanism.

You can also look at other options (depending on the clients accessing the cluster) and see whether those clients support automatic round-robin or load balancing of requests.
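
To illustrate the round-robin idea on the application side, here is a minimal sketch in plain Java. It is not part of the Elasticsearch client API: the class name RoundRobinSelector and the node1:9300 to node4:9300 addresses are hypothetical placeholders, and the sketch only shows how successive requests could be rotated across a fixed list of nodes so that no single node receives every bulk or search request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper for illustration only; not an Elasticsearch class.
final class RoundRobinSelector {
    private final List<String> nodes;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSelector(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes); // defensive copy
    }

    // Returns the next node in rotation; safe to call from several threads.
    String next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), nodes.size());
        return nodes.get(index);
    }

    public static void main(String[] args) {
        // Placeholder addresses (assumption); substitute the real node hosts.
        RoundRobinSelector selector = new RoundRobinSelector(
            Arrays.asList("node1:9300", "node2:9300", "node3:9300", "node4:9300"));
        for (int batch = 0; batch < 6; batch++) {
            System.out.println("bulk batch " + batch + " -> " + selector.next());
        }
    }
}
```

Each call to next() hands back the following address in the list and wraps around at the end, so bulk batches and scroll searches issued through it would be spread evenly across the four nodes. A proxy server or a client node, as suggested above, achieves the same effect without any application code.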
