How to parse JSON in Spark with fasterxml without SparkSQL?

Question

I got this far:

import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.DeserializationFeature

case class Person(name: String, lovesPandas: Boolean)

val mapper = new ObjectMapper()

val input = sc.textFile("files/pandainfo.json")
val result = input.flatMap(record => {
    try{
        Some(mapper.readValue(record, classOf[Person]))
    } catch {
        case e: Exception => None
    }
})
result.collect

but I get Array() as a result (with no error). The file is https://github.com/databricks/learning-spark/blob/master/files/pandainfo.json. How do I go on from here?

Consulting Spark: broadcasting Jackson ObjectMapper, I tried:

import org.apache.spark._
import com.fasterxml.jackson.module.scala.DefaultScalaModule
import com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper
import com.fasterxml.jackson.databind.ObjectMapper
import com.fasterxml.jackson.databind.DeserializationFeature

case class Person(name: String, lovesPandas: Boolean)

val input = """{"name":"Sparky The Bear", "lovesPandas":true}"""
val result = input.flatMap(record => {
    try{
        val mapper = new ObjectMapper()
        mapper.registerModule(DefaultScalaModule)
        mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
        Some(mapper.readValue(record, classOf[Person]))
    } catch {
        case e: Exception => None
    }
})
result.collect

and got:

Name: Compile Error
Message: <console>:34: error: overloaded method value readValue with alternatives:
  [T](x$1: Array[Byte], x$2: com.fasterxml.jackson.databind.JavaType)T <and>
  [T](x$1: Array[Byte], x$2: com.fasterxml.jackson.core.type.TypeReference[_])T <and>
  [T](x$1: Array[Byte], x$2: Class[T])T <and>

Answer

I see that you tried the Learning Spark examples. Here is a reference to the complete code: https://github.com/holdenk/learning-spark-examples/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/BasicParseJsonWithJackson.scala
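
For reference, below is a minimal sketch of that example's approach, not a verbatim copy of the linked file. It assumes the same Person case class and a SparkContext named sc (as in a spark-shell session). The key changes compared to the first attempt above are that DefaultScalaModule is registered, so Jackson can deserialize into a Scala case class, and that the mapper is created inside mapPartitions, so it is built once per partition on the executors instead of being shipped from the driver.

import com.fasterxml.jackson.databind.{DeserializationFeature, ObjectMapper}
import com.fasterxml.jackson.module.scala.DefaultScalaModule

case class Person(name: String, lovesPandas: Boolean)

val input = sc.textFile("files/pandainfo.json")

val result = input.mapPartitions(records => {
    // Build one mapper per partition on the executor; nothing is serialized from the driver.
    val mapper = new ObjectMapper()
    mapper.registerModule(DefaultScalaModule)  // required for Scala case classes
    mapper.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
    records.flatMap(record => {
        try {
            Some(mapper.readValue(record, classOf[Person]))  // one JSON object per line
        } catch {
            case e: Exception => None  // skip lines that do not parse
        }
    })
})

result.collect()

Lines that fail to parse are dropped by the None branch, so collect() should return the records that do match instead of an empty Array().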
