spark-shell error on Windows - can it be ignored if not using hadoop?


Problem description

I got the following error when starting the spark-shell. I'm going to use Spark to process data in SQL Server. Can I ignore the errors?

java.io.IOException: Could not locate executable null\bin\winutils.exe in the Hadoop binaries.

java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState'

Caused by: java.lang.reflect.InvocationTargetException: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog':

Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog'

Caused by: java.lang.IllegalArgumentException: Error while instantiating 'org.apache.spark.sql.hive.HiveExternalCatalog'

Caused by: java.lang.reflect.InvocationTargetException: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive

Caused by: java.lang.reflect.InvocationTargetException: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive

Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Error while running command to get file permissions : java.io.IOException: (null) entry in command string: null ls -F C:\tmp\hive

Answer

tl;dr You'd rather not.

Well, it may be possible, but given you've just started your journey into Spark land, the effort would not pay off.

Windows has never been a developer-friendly OS for me, and whenever I teach Spark to people who use Windows, I take it for granted that we'll have to go through the winutils.exe setup, and often how to work on the command line as well.

Please install winutils.exe as follows:

  1. Run cmd as administrator
  2. Download winutils.exe binary from https://github.com/steveloughran/winutils repository (use hadoop-2.7.1 for Spark 2)
  3. Save winutils.exe binary to a directory of your choice, e.g. c:\hadoop\bin
  4. Set HADOOP_HOME to reflect the directory with winutils.exe (without bin), e.g. set HADOOP_HOME=c:\hadoop
  5. Set PATH environment variable to include %HADOOP_HOME%\bin
  6. Create c:\tmp\hive directory
  7. Execute winutils.exe chmod -R 777 \tmp\hive
  8. Open spark-shell and run spark.range(1).show to see a one-row dataset.
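The steps above can be sketched as a single cmd session. This is a sketch under the assumptions in the list: `c:\hadoop` is just the example install location, and winutils.exe must already be downloaded manually from the steveloughran/winutils repository before the last two commands will work.

```shell
:: Run cmd as administrator (step 1)
:: Download winutils.exe (hadoop-2.7.1 for Spark 2) from
:: https://github.com/steveloughran/winutils and save it as
:: c:\hadoop\bin\winutils.exe (steps 2-3)
mkdir c:\hadoop\bin

:: Point HADOOP_HOME at the directory above bin (step 4)
set HADOOP_HOME=c:\hadoop

:: Make winutils.exe visible on the PATH (step 5)
set PATH=%PATH%;%HADOOP_HOME%\bin

:: Create the scratch directory Hive uses and open up its permissions (steps 6-7)
mkdir c:\tmp\hive
winutils.exe chmod -R 777 \tmp\hive

:: Verify (step 8): start spark-shell, then run spark.range(1).show
spark-shell
```

Note that `set` only affects the current cmd session; to make HADOOP_HOME and the PATH change permanent, set them in the Windows environment-variable settings instead.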
