Spark: shuffle operation leading to long GC pause

Views: 37 | Posted: 2021/11/14 22:22:40 | Tags: scala apache-spark garbage-collection apache-spark-sql g1gc


Question

I'm running Spark 2 and trying to shuffle around 5 terabytes of JSON. I'm running into very long garbage collection pauses while shuffling a Dataset:

val operations = spark.read.json(inPath).as[MyClass]
operations.repartition(partitions, operations("id")).write.parquet("s3a://foo")

Are there any obvious configuration tweaks to deal with this issue? My configuration is as follows:

spark.driver.maxResultSize 6G
spark.driver.memory 10G
spark.executor.extraJavaOptions -XX:+UseG1GC -XX:MaxPermSize=1G -XX:+HeapDumpOnOutOfMemoryError
spark.executor.memory   32G
spark.hadoop.fs.s3a.buffer.dir  /raid0/spark
spark.hadoop.fs.s3n.buffer.dir  /raid0/spark
spark.hadoop.fs.s3n.multipart.uploads.enabled   true
spark.hadoop.parquet.block.size 2147483648
spark.hadoop.parquet.enable.summary-metadata    false
spark.local.dir /raid0/spark
spark.memory.fraction 0.8
spark.mesos.coarse  true
spark.mesos.constraints  priority:1
spark.mesos.executor.memoryOverhead 16000
spark.network.timeout   600
spark.rpc.message.maxSize    1000
spark.speculation   false
spark.sql.parquet.mergeSchema   false
spark.sql.planner.externalSort  true
spark.submit.deployMode client
spark.task.cpus 1

Recommended answer

Adding the following flags got rid of the GC pauses:

spark.executor.extraJavaOptions -XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=35 -XX:ConcGCThreads=12
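
For readers who set configuration in code rather than in a properties file, the flags above could be applied roughly as follows. This is a minimal sketch, not the answerer's setup: the object name `GcTunedConf` and the application name are hypothetical, and the flag values are simply the ones quoted in the answer, which may not suit other heap sizes or core counts.

```scala
import org.apache.spark.SparkConf

object GcTunedConf {
  // G1 flags quoted from the answer; they are not universal defaults.
  val g1Flags: Seq[String] = Seq(
    "-XX:+UseG1GC",
    "-XX:InitiatingHeapOccupancyPercent=35", // start concurrent marking earlier than the JVM default of 45
    "-XX:ConcGCThreads=12"                   // assumes the executor host has cores to spare
  )

  // Builds a SparkConf carrying the executor JVM options.
  // loadDefaults = false keeps the sketch independent of system properties.
  def build(): SparkConf =
    new SparkConf(false)
      .setAppName("shuffle-gc-tuning-sketch") // hypothetical application name
      .set("spark.executor.extraJavaOptions", g1Flags.mkString(" "))

  def main(args: Array[String]): Unit =
    println(build().get("spark.executor.extraJavaOptions"))
}
```

The resulting conf can be handed to `SparkSession.builder().config(GcTunedConf.build())`. Executor JVM options only take effect if they are set before the executors are launched.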

I think it does take a fair amount of tweaking, though. This Databricks post was very helpful.
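
Since the tuning is iterative, it can help to measure how much task time goes to garbage collection before and after each change. The sketch below is one possible way to do that with a `SparkListener`; the class name `GcTimeListener` is hypothetical and is not part of the original answer. It relies on the `jvmGCTime` and `executorRunTime` task metrics that Spark reports per task.

```scala
import java.util.concurrent.atomic.AtomicLong
import org.apache.spark.scheduler.{SparkListener, SparkListenerTaskEnd}

// Hypothetical helper: accumulates JVM GC time and run time over finished tasks.
class GcTimeListener extends SparkListener {
  private val gcMillis  = new AtomicLong(0L)
  private val runMillis = new AtomicLong(0L)

  override def onTaskEnd(taskEnd: SparkListenerTaskEnd): Unit = {
    // taskMetrics can be null for tasks that failed before reporting metrics.
    Option(taskEnd.taskMetrics).foreach { m =>
      gcMillis.addAndGet(m.jvmGCTime)
      runMillis.addAndGet(m.executorRunTime)
    }
  }

  // Fraction of executor run time spent in GC; 0.0 until a task has finished.
  def gcFraction: Double = {
    val run = runMillis.get()
    if (run == 0L) 0.0 else gcMillis.get().toDouble / run
  }
}
```

To use it, register an instance with `spark.sparkContext.addSparkListener(listener)` before the `repartition(...).write` job and read `listener.gcFraction` once the job finishes. The same per-task GC time is also visible in the Spark UI, so the listener is mainly useful for comparing runs programmatically.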
