When do we need a large heap with Elasticsearch?


Problem description

Running ES 1.5.2
Java 1.8_45
Windows 2008
4 nodes, each with 32 cores, 128GB RAM, and 5TB of SSD storage.

My goal is to index about 2.5 billion documents. I am up to 810 million so far, at an average of 30k per document.

I currently have ES_HEAP_SIZE=30g

But I have been experiencing a lot of memory pressure and stop-the-world (STW) GC pauses. For example, one node is currently always above 90% heap usage, while the rest coast anywhere between 30% and 40%. So it seems that one node won't GC?

Only two things are happening on the cluster: bulk indexing (no errors logged) and some scroll searches.

I use doc values where I can. Currently there is no field data cache (except Marvel's, which is very small), and the filter cache is minimal, about 100MB per node.

The nodes are still trying to recover, so I'm reluctant to stop the cluster fully and reset the heap to 10GB.

This is how I connect to the cluster for both bulk indexing and scroll searches:

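import org.elasticsearch.client.Client;
import org.elasticsearch.client.transport.TransportClient;
import org.elasticsearch.common.settings.ImmutableSettings;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.common.transport.InetSocketTransportAddress;

// Do this once at application startup and re-use the client instance.
Settings settings = ImmutableSettings
    .settingsBuilder()
    .put("cluster.name", "xxxx")
    .build();

// One transport address per node in the cluster (host names elided as "xxxx").
Client client = new TransportClient(settings)
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300))
    .addTransportAddress(new InetSocketTransportAddress("xxxx", 9300));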

Solution

Don't send the bulk requests to only one node. The same goes for the search requests.

A bulk request is held in a memory buffer on the node that receives it, so sending every request to a single node is not a good idea. Round-robin the requests, either through a proxy server (if you have one) or through a client node that you send the requests to. The client node knows how to do the round-robin.

You can also look at other options (depending on the clients accessing the cluster) and see whether those clients support automatic round-robin/load balancing of requests.
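
To illustrate what round-robin means here, the following is a minimal sketch in plain Java, not Elasticsearch code. The class name RoundRobinSelector and the node names are hypothetical and exist only for this example. It just shows how successive requests can be spread evenly over a fixed list of nodes, so that no single node buffers every bulk request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper (not part of the Elasticsearch API):
// hands out targets from a fixed list in rotating order.
final class RoundRobinSelector<T> {
    private final List<T> targets;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinSelector(List<T> targets) {
        if (targets.isEmpty()) {
            throw new IllegalArgumentException("need at least one target");
        }
        // Defensive copy so later changes to the caller's list have no effect.
        this.targets = new ArrayList<>(targets);
    }

    // Thread-safe: each call returns the next target in the rotation.
    T next() {
        // floorMod keeps the index non-negative even after the counter overflows.
        int index = Math.floorMod(counter.getAndIncrement(), targets.size());
        return targets.get(index);
    }

    public static void main(String[] args) {
        // Placeholder node names; substitute the real host names of the cluster.
        RoundRobinSelector<String> nodes =
            new RoundRobinSelector<>(Arrays.asList("node-1", "node-2", "node-3", "node-4"));
        for (int request = 0; request < 8; request++) {
            System.out.println("bulk request " + request + " -> " + nodes.next());
        }
    }
}
```

With four nodes, each one receives every fourth request. A proxy or a client node would do the equivalent selection on behalf of the application.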
