(null) entry in command string exception in saveAsTextFile() on PySpark

Views: 43 Published: 2020/4/25 6:26:43 Tags: apache-spark pyspark jupyter-notebook

Problem description

I am working in PySpark in a Jupyter notebook (Python 2.7) on Windows 7. I have an RDD of type pyspark.rdd.PipelinedRDD called idSums. When I try to execute idSums.saveAsTextFile("Output"), I receive the following error:

Py4JJavaError: An error occurred while calling o834.saveAsTextFile.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 33.0 failed 1 times, most recent failure: Lost task 1.0 in stage 33.0 (TID 131, localhost): java.io.IOException: (null) entry in command string: null chmod 0644 C:\Users\seride\Desktop\Experiments\PySpark\Output\_temporary\0\_temporary\attempt_201611231307_0033_m_000001_131\part-00001

There shouldn't be any problem with the RDD object, in my opinion, because I'm able to execute other actions without error, e.g. executing idSums.collect() produces the correct output.

Furthermore, the Output directory is created (with all subdirectories) and the file part-00001 is created, but it is 0 bytes.

Recommended answer

You are missing winutils.exe, a Hadoop binary. Download the winutils.exe build that matches your system (64-bit or 32-bit) and set your Hadoop home to point to the folder that holds it.

First way:

Download the file.
Create a hadoop folder on your system, e.g. on C:
Create a bin folder in the hadoop directory, e.g. C:\hadoop\bin
Paste winutils.exe into bin, e.g. C:\hadoop\bin\winutils.exe
In System Properties -> Advanced System Settings, under User Variables, create a new variable with Name: HADOOP_HOME and Path: C:\hadoop\
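To confirm that the folder layout described above is in place, a small check can be run from the notebook. This is only a sketch: the helper name find_winutils is made up for illustration, and it assumes the layout HADOOP_HOME\bin\winutils.exe from the steps above.

```python
import os


def find_winutils(hadoop_home=None):
    """Return the path to winutils.exe under <hadoop_home>/bin, or None if it is missing."""
    # Fall back to the HADOOP_HOME environment variable when no folder is passed in.
    home = hadoop_home or os.environ.get("HADOOP_HOME")
    if not home:
        return None
    candidate = os.path.join(home, "bin", "winutils.exe")
    return candidate if os.path.isfile(candidate) else None


if __name__ == "__main__":
    path = find_winutils()
    if path is None:
        print("winutils.exe not found; check HADOOP_HOME and its bin folder")
    else:
        print("winutils.exe found at " + path)
```

If the check reports that the file is missing right after the variable was created, the notebook process may still be running with the old environment and may need to be restarted.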

Second way:

You can set the Hadoop home directly in your Java program like this:

System.setProperty("hadoop.home.dir", "C:\\hadoop");
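Since the question uses PySpark rather than Java, a rough Python counterpart may be more convenient. The sketch below rests on two assumptions: that Hadoop falls back to the HADOOP_HOME environment variable when hadoop.home.dir is not set, and that the variable is set before the SparkContext is created, so the JVM that PySpark launches inherits it. The helper name set_hadoop_home is hypothetical.

```python
import os


def set_hadoop_home(hadoop_home):
    """Set HADOOP_HOME for this process and return where winutils.exe is expected."""
    # Assumption: Hadoop reads HADOOP_HOME when hadoop.home.dir is not set.
    os.environ["HADOOP_HOME"] = hadoop_home
    return os.path.join(hadoop_home, "bin", "winutils.exe")


if __name__ == "__main__":
    # Run this before the SparkContext exists; a JVM that is already running will not see the change.
    expected = set_hadoop_home("C:\\hadoop")
    print("Hadoop will look for " + expected)
```

If a SparkContext is already running in the notebook, restarting the kernel and running this first is the safer order.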
