Spark error: Exception in thread "main" java.lang.UnsupportedOperationException
Problem description
I am writing a Scala/Spark program which finds the max salary of an employee. The employee data is available in a CSV file; the salary column has a comma separator for thousands and also a $ prefixed to it, e.g. $74,628.00.
To handle the comma and dollar sign, I have written a parser function in Scala which splits each line on "," and then maps each column to individual variables to be assigned to a case class.
My parser program looks like the one below. To eliminate the comma and dollar signs I am using the replace function to replace them with an empty string, and then finally typecast the result to Int.
def ParseEmployee(line: String): Classes.Employee = {
  val fields = line.split(",")
  val Name = fields(0)
  val JOBTITLE = fields(2)
  val DEPARTMENT = fields(3)
  val temp = fields(4)
  temp.replace(",", "") // To eliminate the ,
  temp.replace("$", "") // To remove the $
  val EMPLOYEEANNUALSALARY = temp.toInt // Typecast the string to Int
  Classes.Employee(Name, JOBTITLE, DEPARTMENT, EMPLOYEEANNUALSALARY)
}
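As an aside (not part of the original question): `String.replace` returns a new string rather than modifying the receiver, so the two calls above actually discard their results. A minimal sketch of the cleanup step, chained so the replacements take effect, using a hypothetical helper name `parseSalary` and assuming the salary field reaches it intact (a naive `split(",")` on an unquoted value like `$74,628.00` would already have broken it apart):

```scala
// Hypothetical helper (not from the original post): strip the "$" and the
// thousands separator, then go through Double to drop the ".00" part.
def parseSalary(raw: String): Int =
  raw.replace("$", "").replace(",", "").toDouble.toInt

// e.g. parseSalary("$74,628.00") yields 74628
```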
My case class is as follows
case class Employee (Name: String,
                     JOBTITLE: String,
                     DEPARTMENT: String,
                     EMPLOYEEANNUALSALARY: Number)
My Spark DataFrame SQL query looks like below
val empMaxSalaryValue = sc.sqlContext.sql("Select Max(EMPLOYEEANNUALSALARY) From EMP")
empMaxSalaryValue.show
When I run this program I get the exception below
Exception in thread "main" java.lang.UnsupportedOperationException: No Encoder found for Number
- field (class: "java.lang.Number", name: "EMPLOYEEANNUALSALARY")
- root class: "Classes.Employee"
at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:625)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$10.apply(ScalaReflection.scala:619)
at org.apache.spark.sql.catalyst.ScalaReflection$$anonfun$10.apply(ScalaReflection.scala:607)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
at scala.collection.immutable.List.flatMap(List.scala:344)
at org.apache.spark.sql.catalyst.ScalaReflection$.org$apache$spark$sql$catalyst$ScalaReflection$$serializerFor(ScalaReflection.scala:607)
at org.apache.spark.sql.catalyst.ScalaReflection$.serializerFor(ScalaReflection.scala:438)
at org.apache.spark.sql.catalyst.encoders.ExpressionEncoder$.apply(ExpressionEncoder.scala:71)
at org.apache.spark.sql.Encoders$.product(Encoders.scala:275)
at org.apache.spark.sql.SparkSession.createDataFrame(SparkSession.scala:282)
at org.apache.spark.sql.SQLContext.createDataFrame(SQLContext.scala:272)
at CalculateMaximumSalary$.main(CalculateMaximumSalary.scala:27)
at CalculateMaximumSalary.main(CalculateMaximumSalary.scala)
Any idea why I am getting this error? What mistake am I making here, and why is it not able to typecast to a number?
Is there a better approach to this problem of getting the maximum salary of the employee?
Recommended answer
Spark SQL provides only a limited number of Encoders, which target concrete classes. Abstract classes like Number are not supported (they can be used only with the generic binary Encoders).
Since you convert to Int anyway, just redefine the class:
case class Employee (
  Name: String,
  JOBTITLE: String,
  DEPARTMENT: String,
  EMPLOYEEANNUALSALARY: Int
)
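With the salary typed as `Int`, Spark can derive a product Encoder for the case class. Independent of Spark, the `Max(EMPLOYEEANNUALSALARY)` aggregation can be sanity-checked over plain Scala collections; a sketch with made-up sample rows (not data from the question):

```scala
// Redefined case class with a concrete Int salary, as recommended above.
case class Employee(Name: String,
                    JOBTITLE: String,
                    DEPARTMENT: String,
                    EMPLOYEEANNUALSALARY: Int)

// Hypothetical sample records for illustration only.
val employees = Seq(
  Employee("A", "Engineer", "IT", 74628),
  Employee("B", "Analyst", "Finance", 90210)
)

// Same logic the SQL query Max(EMPLOYEEANNUALSALARY) expresses.
val maxSalary = employees.map(_.EMPLOYEEANNUALSALARY).max
// maxSalary == 90210
```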