apache-spark memory consumption for cache() / persist()


Problem description

My Spark cluster hangs when I try to cache() or persist(MEMORY_ONLY_SER()) my RDDs. It works great and computes the results in about 7 minutes if I don't use cache().
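
For reference, this is roughly what that kind of caching call looks like in Scala against the Spark 0.9 API; the master URL, object name and input path below are placeholders, not details from the question:

import org.apache.spark.SparkContext
import org.apache.spark.storage.StorageLevel

object OfflineAnalysisSketch {
  def main(args: Array[String]) {
    val sc = new SparkContext("spark://master:7077", "OfflineAnalysis")  // placeholder master URL
    val lines = sc.textFile("s3n://my-bucket/input/*")                   // placeholder input path
    // cache() is shorthand for persist(StorageLevel.MEMORY_ONLY);
    // MEMORY_ONLY_SER stores partitions as serialized byte arrays instead,
    // which is more compact but must still fit entirely in the executor heaps.
    val cached = lines.persist(StorageLevel.MEMORY_ONLY_SER)
    println(cached.count())
  }
}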

I've got 6 c3.xlarge EC2 instances (4 cores and 7.5 GB RAM each), which gives 24 cores and 37.7 GB in total.

I run my application with the following command on the master:

SPARK_MEM=5g MEMORY_FRACTION="0.6" SPARK_HOME="/root/spark" java -cp ./uber-offline.jar:/root/spark/assembly/target/scala-2.10/spark-assembly_2.10-0.9.0-incubating-hadoop1.0.4.jar pl.instream.dsp.offline.OfflineAnalysis

The data set is about 50 GB of data partitioned into 24 files. I compressed it and stored it in an S3 bucket as 24 files (each of them between 7 MB and 300 MB in size).

I absolutely can't find a reason for this behaviour of my cluster, but it seems like Spark consumed all available memory and got stuck in a GC loop. When I look at the verbose GC output, I find cycles like the ones below:

[GC 5208198K(5208832K), 0,2403780 secs]
[Full GC 5208831K->5208212K(5208832K), 9,8765730 secs]
[Full GC 5208829K->5208238K(5208832K), 9,7567820 secs]
[Full GC 5208829K->5208295K(5208832K), 9,7629460 secs]
[GC 5208301K(5208832K), 0,2403480 secs]
[Full GC 5208831K->5208344K(5208832K), 9,7497710 secs]
[Full GC 5208829K->5208366K(5208832K), 9,7542880 secs]
[Full GC 5208831K->5208415K(5208832K), 9,7574860 secs]

This finally leads to messages like:

WARN storage.BlockManagerMasterActor: Removing BlockManager BlockManagerId(0, ip-xx-xx-xxx-xxx.eu-west-1.compute.internal, 60048, 0) with no recent heart beats: 64828ms exceeds 45000ms

...and stops any progress in computing. This looks like the memory is 100% consumed, but I tried machines with more RAM (30 GB each, for example), and the effect is the same.

What might be the reason for such behaviour? Could anybody help?

Recommended answer

Try using more partitions; you should have 2 - 4 per CPU. In my experience, increasing the number of partitions is often the easiest way to make a program more stable (and often faster).

By default I think your code will use 24 partitions, but for 50 GB of data that is far too few. I'd try a few hundred partitions at least.
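
A minimal sketch of two ways to get more partitions, assuming sc is the job's SparkContext (the path and the count of 200 are illustrative, not values from the question):

// Ask for more input splits when reading (the second argument is the minimum number of splits)...
val lines = sc.textFile("s3n://my-bucket/input/*", 200)
// ...or reshuffle an RDD that already has too few partitions; this costs a shuffle,
// but every downstream stage then works on smaller tasks.
val spread = lines.repartition(200)
println(spread.partitions.length)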

Next, you use SPARK_MEM=5g but say each node has 7.5 GB, so you might as well use SPARK_MEM=7500m.

You could also try increasing the memory fraction, but I think the above is more likely to help.
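
For what it's worth, here is a hedged sketch of how those two settings could be applied through the SparkConf API that ships with Spark 0.9, rather than the environment variables used in the question; the master URL and the concrete values are only examples:

import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf()
  .setMaster("spark://master:7077")             // placeholder master URL
  .setAppName("OfflineAnalysis")
  .set("spark.executor.memory", "7500m")        // in place of SPARK_MEM=5g
  .set("spark.storage.memoryFraction", "0.7")   // default is 0.6; raise only if the rest of the job has headroom
val sc = new SparkContext(conf)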

General points: use HDFS for your files, not S3 - it's hugely faster. Make sure you munge your data properly before caching it - e.g. if you have, say, TSV data with 100 columns but only use 10 of the fields, make sure you extract those fields before you try to cache.
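
As an illustration of that last point, reusing sc and the StorageLevel import from the earlier sketch (the path and the column indices are made up for the example):

// Project out only the fields that are actually used before caching,
// so the cached representation holds a handful of columns instead of 100.
val raw = sc.textFile("s3n://my-bucket/input/*")
val slim = raw.map { line =>
  val cols = line.split('\t')
  (cols(0), cols(3), cols(7))   // hypothetical field positions
}
slim.persist(StorageLevel.MEMORY_ONLY_SER)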
