Why does Spark application work in spark-shell but fail with "org.apache.spark.SparkException: Task not serializable" in Eclipse?
Question
With the purpose of saving a file (delimited by |) into a DataFrame, I developed the following code:
val file = sc.textFile("path/file/")
val rddFile = file.map(a => a.split("\\|")).map(x => ArchivoProcesar(x(0), x(1), x(2), x(3)))
val dfInsumos = rddFile.toDF()
The case class used for the creation of my DataFrame is defined as follows:
case class ArchivoProcesar(nombre_insumo: String, tipo_rep: String, validado: String, Cargado: String)
I have done some functional tests using spark-shell, and my code works fine, generating the DataFrame correctly. But when I executed my program in Eclipse, it threw the following error:
Is something missing inside the Scala class that I'm using and running with Eclipse? What could be the reason that my functions work correctly in the spark-shell but not in my Eclipse app?
Regards.
Answer
Your case class must have public, top-level scope: you cannot define ArchivoProcesar inside another class. In spark-shell every declaration is wrapped into a top-level object automatically, so this problem never shows up there. In a compiled Eclipse application, however, a case class nested inside a class keeps a hidden reference to its enclosing instance, so when Spark serializes the task closure it tries to serialize the whole enclosing class and fails with "Task not serializable". Move ArchivoProcesar to the top level of the file (or into an object).
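The effect can be demonstrated without Spark at all, using plain Java serialization. This is a minimal sketch with hypothetical names (Outer, Inner, TopLevel): an instance of a case class declared inside a non-serializable class carries a reference to that enclosing instance and cannot be serialized, while a top-level case class serializes fine.

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Hypothetical enclosing class; note it does NOT extend Serializable,
// just like most driver-side classes that hold Spark code.
class Outer {
  // Inner case class: instances keep a hidden $outer reference to Outer.
  case class Inner(name: String)
}

// Top-level case class: no hidden reference, serializes on its own.
case class TopLevel(name: String)

object SerializationDemo {
  // Returns true if the object survives Java serialization,
  // false if a NotSerializableException is thrown.
  def trySerialize(obj: AnyRef): Boolean =
    try {
      val oos = new ObjectOutputStream(new ByteArrayOutputStream())
      oos.writeObject(obj)
      true
    } catch {
      case _: NotSerializableException => false
    }

  def main(args: Array[String]): Unit = {
    val outer = new Outer
    println(trySerialize(outer.Inner("a"))) // fails: drags in non-serializable Outer
    println(trySerialize(TopLevel("a")))    // succeeds: self-contained case class
  }
}
```

Spark's closure serializer hits the same wall: shipping an ArchivoProcesar defined inside a class forces serialization of the enclosing class, which is exactly what the "Task not serializable" exception reports.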