Apache spark: setting spark.eventLog.enabled and spark.eventLog.dir at submit or Spark start


Problem description


I would like to set spark.eventLog.enabled and spark.eventLog.dir at the spark-submit or start-all level -- not require it to be enabled in the scala/java/python code. I have tried various things with no success. Setting conf/spark-defaults.conf as:

spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://namenode:8021/directory

spark.eventLog.enabled           true
spark.eventLog.dir               file:///some/where

Running spark-submit as:

spark-submit --conf "spark.eventLog.enabled=true" --conf "spark.eventLog.dir=file:///tmp/test" --master spark://server:7077 examples/src/main/python/pi.py

Starting Spark with environment variables:

SPARK_DAEMON_JAVA_OPTS="-Dspark.eventLog.enabled=true -Dspark.history.fs.logDirectory=$sparkHistoryDir -Dspark.history.provider=org.apache.spark.deploy.history.FsHistoryProvider -Dspark.history.fs.cleaner.enabled=true -Dspark.history.fs.cleaner.interval=2d"

And, just for overkill:

SPARK_HISTORY_OPTS="-Dspark.eventLog.enabled=true -Dspark.history.fs.logDirectory=$sparkHistoryDir -Dspark.history.provider=org.apache.spark.deploy.history.FsHistoryProvider -Dspark.history.fs.cleaner.enabled=true -Dspark.history.fs.cleaner.interval=2d"


Where and how must these things be set to get history on arbitrary jobs?

Answer


I solved the problem, yet strangely I had tried this before... All the same, now it seems like a stable solution:


Create a directory in HDFS for logging, say /eventLogging

hdfs dfs -mkdir /eventLogging


Then spark-shell or spark-submit (or whatever) can be run with the following options: --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs://<hdfsNameNodeAddress>:8020/eventLogging

For example:

spark-shell --conf spark.eventLog.enabled=true --conf spark.eventLog.dir=hdfs://<hdfsNameNodeAddress>:8020/eventLogging
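To apply the same settings to every job without repeating the flags, the two properties can also go in conf/spark-defaults.conf, mirroring the entries already tried in the question. A minimal sketch, assuming the history server should read the same directory (the namenode address is a placeholder, and spark.history.fs.logDirectory may equally be set via SPARK_HISTORY_OPTS as shown above):

```properties
# conf/spark-defaults.conf -- sketch; replace <hdfsNameNodeAddress> with your namenode
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs://<hdfsNameNodeAddress>:8020/eventLogging

# Assumption: point the history server at the same directory the jobs log to
spark.history.fs.logDirectory    hdfs://<hdfsNameNodeAddress>:8020/eventLogging
```

With this in place, starting the history server (sbin/start-history-server.sh, web UI on port 18080 by default) should show completed jobs without any per-submit flags.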

