How to register InternalRow with Kryo in Spark
Question
I want to run Spark with Kryo serialisation, so I set spark.serializer=org.apache.spark.serializer.KryoSerializer
and spark.kryo.registrationRequired=true
When I then run my code, I get the error:
Class is not registered: org.apache.spark.sql.catalyst.InternalRow[]
Following advice I found elsewhere, I tried to register the class with:
sc.getConf.registerKryoClasses(Array( classOf[ org.apache.spark.sql.catalyst.InternalRow[_] ] ))
But then the error is:
org.apache.spark.sql.catalyst.InternalRow does not take type parameters
Recommended answer
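The trailing [] in the error message means Kryo is asking for the array class InternalRow[], which in Scala is Array[InternalRow], not a parameterised InternalRow[_]. You should register it in an external registrator class such as: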
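import com.esotericsoftware.kryo.Kryo
import org.apache.spark.serializer.KryoRegistrator
import org.apache.spark.sql.catalyst.InternalRow

// Registers the array class that the error message reports as InternalRow[].
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[Array[InternalRow]])
  }
}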
Source: http://spark.apache.org/docs/0.6.0/tuning.html
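The registrator only takes effect once Spark is told about it through the spark.kryo.registrator setting. The following is a minimal sketch of that wiring, not a tested setup: it repeats the registrator so the block is self-contained, and it assumes MyRegistrator is compiled as a top-level class in your application jar so that Spark can instantiate it by name.

```scala
import com.esotericsoftware.kryo.Kryo
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoRegistrator
import org.apache.spark.sql.catalyst.InternalRow

// Same registrator as in the answer: registers the array class InternalRow[].
// Assumption: in a real application this is a top-level class on the classpath.
class MyRegistrator extends KryoRegistrator {
  override def registerClasses(kryo: Kryo): Unit = {
    kryo.register(classOf[Array[InternalRow]])
  }
}

// Build the configuration before any SparkContext or SparkSession exists.
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .set("spark.kryo.registrationRequired", "true")
  // Spark loads the registrator by its fully qualified class name.
  .set("spark.kryo.registrator", classOf[MyRegistrator].getName)
```

The resulting conf can then be passed to SparkSession.builder().config(conf), or the same three keys can be supplied with --conf on spark-submit.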
Or, if you want to register it in your Spark class:
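import org.apache.spark.sql.catalyst.InternalRow
val cls: Class[Array[InternalRow]] = classOf[Array[InternalRow]]
// Caution: getConf returns a copy of the configuration, so this call may not affect an already-running context.
spark.sparkContext.getConf.registerKryoClasses(Array(cls))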
I use the first one and it works perfectly; I haven't tested the second one.
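Because the second variant is untested and SparkContext.getConf is documented to return a copy of the configuration, a more cautious reading of that approach is to call registerKryoClasses on the SparkConf before the session is created. The sketch below illustrates this under that assumption; the name conf is illustrative and the code has not been run against the original poster's job.

```scala
import org.apache.spark.SparkConf
import org.apache.spark.sql.catalyst.InternalRow

// Register the array class on the SparkConf itself, before creating the session.
// registerKryoClasses also switches spark.serializer to KryoSerializer.
val conf = new SparkConf()
  .set("spark.kryo.registrationRequired", "true")
  .registerKryoClasses(Array(classOf[Array[InternalRow]]))
```

Passing this conf to SparkSession.builder().config(conf) should make the registration visible to the serializer from the start. With registrationRequired enabled, further "Class is not registered" errors for other classes may still appear and would need the same treatment.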