How to parse JSON in Spark with fasterxml without SparkSQL?
Problem description
This is as far as I got:
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.DeserializationFeature
case class Person(name: String, lovesPandas: Boolean)
val mapper = new ObjectMapper()
val input = sc.textFile("files/pandainfo.json")
val result = input.flatMap(record => {
  try {
    Some(mapper.readValue(record, classOf[Person]))
  } catch {
    case e: Exception => None
  }
})
result.collect
but get Array() as a result (with no error). The file is https://github.com/databricks/learning-spark/blob/master/files/pandainfo.json How do I go on from here?
Consulting Spark: broadcast jackson ObjectMapper, I tried
import org.apache.spark._
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.DeserializationFeature
case class Person(name: String, lovesPandas: Boolean)
val input = """{"name":"Sparky The Bear", "lovesPandas":true}"""
val result = input.flatMap(record => {
  try {
    val mapper = new ObjectMapper()
    mapper.registerModule(DefaultScalaModule)
    mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
    Some(mapper.readValue(record, classOf[Person]))
  } catch {
    case e: Exception => None
  }
})
result.collect
and got
Name: Compile Error
Message: <console>:34: error: overloaded method value readValue with alternatives:
[T](x$1: Array[Byte], x$2: com.fasterxml.jackson.databind.JavaType)T <and>
[T](x$1: Array[Byte], x$2: com.fasterxml.jackson.core.type.TypeReference[_])T <and>
[T](x$1: Array[Byte], x$2: Class[T])T <and>
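The compile error happens because `input` here is a plain `String`, not an RDD, so `flatMap` iterates over its characters and passes a `Char` to `readValue`, which matches none of the overloads. Wrapping the string in a collection restores a `String` element type. A minimal sketch along those lines (same names as in the question; a hedged illustration, not the accepted solution):

```scala
import com.fasterxml.jackson.databind.{DeserializationFeature, ObjectMapper}
import com.fasterxml.jackson.module.scala.DefaultScalaModule

case class Person(name: String, lovesPandas: Boolean)

// Wrap the single JSON line in a Seq so flatMap sees Strings, not Chars.
val input = Seq("""{"name":"Sparky The Bear", "lovesPandas":true}""")
val result = input.flatMap { record =>
  try {
    val mapper = new ObjectMapper()
    // DefaultScalaModule is what lets Jackson construct Scala case classes.
    mapper.registerModule(DefaultScalaModule)
    mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
    Some(mapper.readValue(record, classOf[Person]))
  } catch {
    case e: Exception => None
  }
}
```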
Recommended answer
I see that you tried the Learning Spark examples. Here is the reference to the complete code: https://github.com/holdenk/learning-spark-examples/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/BasicParseJsonWithJackson.scala
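For the original RDD version, the pattern in the referenced example is to build the mapper inside `mapPartitions`, so it is created once per partition on the executors rather than serialized from the driver, and to register `DefaultScalaModule` so Jackson can deserialize into a Scala case class. A sketch of that shape, assuming the file path from the question and a Spark shell (`sc` available); untested outside such an environment:

```scala
import com.fasterxml.jackson.databind.{DeserializationFeature, ObjectMapper}
import com.fasterxml.jackson.module.scala.DefaultScalaModule

case class Person(name: String, lovesPandas: Boolean)

val input = sc.textFile("files/pandainfo.json")
val result = input.mapPartitions { records =>
  // One mapper per partition: ObjectMapper is costly to create and not
  // meant to be closed over from the driver, so build it on the executor.
  val mapper = new ObjectMapper()
  mapper.registerModule(DefaultScalaModule)
  mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
  records.flatMap { record =>
    // Non-JSON lines in pandainfo.json simply become None and are dropped.
    try Some(mapper.readValue(record, classOf[Person]))
    catch { case e: Exception => None }
  }
}
result.collect()
```

This also explains the empty `Array()` in the first attempt: without `DefaultScalaModule` registered, Jackson cannot instantiate the case class, every record throws, and the `catch` silently turns each one into `None`.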