spark-shell error on Windows - can it be ignored if not using hadoop?
Question
I got the following errors when starting the spark-shell. I'm going to use Spark to process data in SQL Server. Can I ignore the errors?
java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.
java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState'
Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog'
Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog'
Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive
Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive
Answer
tl;dr You'd rather not.

Well, it may be possible, but given you've just started your journey into Spark land, the effort would not pay off.
Windows has never been a developer-friendly OS to me, and whenever I teach people Spark and they use Windows, I just take it for granted that we'll have to go through the winutils.exe setup, and many times also how to work on the command line.
Install winutils.exe as follows:
- Run cmd as administrator
- Download the winutils.exe binary from the https://github.com/steveloughran/winutils repository (use hadoop-2.7.1 for Spark 2)
- Save the winutils.exe binary to a directory of your choice, e.g. c:\hadoop\bin
- Set HADOOP_HOME to reflect the directory with winutils.exe (without bin), e.g. set HADOOP_HOME=c:\hadoop
- Set the PATH environment variable to include %HADOOP_HOME%\bin
- Create the c:\tmp\hive directory
- Execute winutils.exe chmod -R 777 \tmp\hive
- Open spark-shell and run spark.range(1).show to see a one-row dataset.
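The steps above can be collected into a single cmd session, run from a window opened as administrator. The c:\hadoop path is only an example; point HADOOP_HOME at wherever you actually saved winutils.exe:

```bat
rem Assumes winutils.exe was downloaded to c:\hadoop\bin (example path).
set HADOOP_HOME=c:\hadoop
set PATH=%PATH%;%HADOOP_HOME%\bin

rem Directory Spark/Hive uses for scratch space on Windows.
mkdir c:\tmp\hive
winutils.exe chmod -R 777 \tmp\hive

rem Start spark-shell; inside it, verify with:
rem   spark.range(1).show
spark-shell
```

Note that `set` only affects the current cmd session; to make HADOOP_HOME and the PATH change permanent, set them in the system environment variables instead.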