Why does Spark application work in spark-shell but fail with "org.apache.spark.SparkException: Task not serializable" in Eclipse?
Problem description
With the purpose of saving a file (delimited by |) into a DataFrame, I have developed the following code:
val file = sc.textFile("path/file/")
val rddFile = file.map(a => a.split("\\|")).map(x => ArchivoProcesar(x(0), x(1), x(2), x(3)))
val dfInsumos = rddFile.toDF()
The case class used for the creation of my DataFrame is defined as follows:
case class ArchivoProcesar(nombre_insumo: String, tipo_rep: String, validado: String, Cargado: String)
I have done some functional tests using spark-shell, and my code works fine, generating the DataFrame correctly. But when I execute my program in Eclipse, it throws the following error:
Is there something missing in the Scala class that I'm running with Eclipse? Or what could be the reason that my functions work correctly in the spark-shell but not in my Eclipse app?
Regards.
Solution: Your case class must have public scope. You can't have ArchivoProcesar nested inside a class.
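To see why the scope matters, here is a minimal sketch that reproduces the underlying problem without Spark, using plain Java serialization (the same mechanism Spark relies on when shipping tasks to executors). The `Driver` class and the nested case class name below are hypothetical, standing in for the Eclipse application class:

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Top level: compiles to a standalone class, serializable on its own.
case class ArchivoProcesar(nombre_insumo: String, tipo_rep: String, validado: String, Cargado: String)

// Hypothetical driver class, standing in for the Eclipse application class.
class Driver {
  // Nested inside Driver: every instance keeps a hidden $outer reference
  // to the enclosing Driver, which is not serializable -- the root cause
  // of "Task not serializable" when Spark ships the closure to executors.
  case class ArchivoNested(valor: String)

  def nestedInstance: Serializable = ArchivoNested("x")
}

object SerializationDemo {
  // Returns true if Java serialization of obj succeeds.
  def serializes(obj: Serializable): Boolean =
    try {
      new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    println(serializes(ArchivoProcesar("a", "b", "c", "d"))) // true
    println(serializes(new Driver().nestedInstance))         // false
  }
}
```

This also suggests why the same code works in spark-shell: the REPL wraps user definitions in serializable wrapper objects, whereas in a compiled Eclipse project the case class must be declared at the top level of the file (or inside an object) to be serializable on its own.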