How to enable spark-history server for standalone cluster non hdfs mode


Question

I have set up a Spark 2.1.1 cluster (1 master, 2 slaves) in standalone mode, following http://paxcel.net/blog/how-to-setup-apache-spark-standalone-cluster-on-multiple-machine/. I do not have a pre-existing Hadoop setup on any of the machines. I wanted to start the spark-history server, so I ran it as follows:

roshan@bolt:~/spark/spark_home/sbin$ ./start-history-server.sh

and in spark-defaults.conf I set this:

spark.eventLog.enabled           true

But it fails with the error:

17/06/29 22:59:03 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(roshan); groups with view permissions: Set(); users  with modify permissions: Set(roshan); groups with modify permissions: Set()
17/06/29 22:59:03 INFO FsHistoryProvider: History server ui acls disabled; users with admin permissions: ; groups with admin permissions
Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.spark.deploy.history.HistoryServer$.main(HistoryServer.scala:278)
    at org.apache.spark.deploy.history.HistoryServer.main(HistoryServer.scala)
Caused by: java.io.FileNotFoundException: Log directory specified does not exist: file:/tmp/spark-events Did you configure the correct one through spark.history.fs.logDirectory?
    at org.apache.spark.deploy.history.FsHistoryProvider.org$apache$spark$deploy$history$FsHistoryProvider$$startPolling(FsHistoryProvider.scala:214)

What should I set spark.history.fs.logDirectory and spark.eventLog.dir to?

Update 1:

spark.eventLog.enabled           true
spark.history.fs.logDirectory   file:////home/roshan/spark/spark_home/logs
spark.eventLog.dir               file:////home/roshan/spark/spark_home/logs

but I am always getting this error:

java.lang.IllegalArgumentException: Codec [1] is not available. Consider setting spark.io.compression.codec=snappy at org.apache.spark.io.Co

Answer

By default, Spark uses file:/tmp/spark-events as the log directory for the history server, and your log clearly says that spark.history.fs.logDirectory is not configured.

First of all, you need to create the spark-events folder in /tmp (which is not a good idea, since /tmp is wiped every time the machine reboots) and then add spark.history.fs.logDirectory to spark-defaults.conf pointing to that directory. But I suggest you create a different folder that the spark user has access to, and update the spark-defaults.conf file accordingly (see the sketch below).
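For example, a minimal sketch of creating such a directory (assuming the Spark processes run as the roshan user seen in the prompt above, and using /opt/spark-events as in the example further down):

sudo mkdir -p /opt/spark-events                # create the event log directory
sudo chown -R roshan:roshan /opt/spark-events  # give the spark user ownership of it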

You need to define two more variables in the spark-defaults.conf file:

spark.eventLog.dir              file:<path where you want to store your logs>
spark.history.fs.logDirectory   file:<same path as above>

Suppose you want to store the logs in /opt/spark-events, which the spark user has access to; then the above parameters in spark-defaults.conf would be:

spark.eventLog.enabled          true
spark.eventLog.dir              file:/opt/spark-events
spark.history.fs.logDirectory   file:/opt/spark-events
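With that in place, a typical flow would look like the sketch below (assuming bolt from the prompt above is the master host; 18080 is the history server's default web UI port, and SparkPi is just an example application):

# restart the history server so it picks up the new configuration
./sbin/stop-history-server.sh
./sbin/start-history-server.sh

# run any application; with spark.eventLog.enabled=true its events are written to /opt/spark-events
./bin/spark-submit --master spark://bolt:7077 \
    --class org.apache.spark.examples.SparkPi \
    examples/jars/spark-examples_2.11-2.1.1.jar 100

# browse the completed application at http://bolt:18080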

You can read more in the Spark documentation under Monitoring and Instrumentation.
