Spark non-serializable exception when parsing JSON with json4s


Problem description

I've run into an issue with attempting to parse JSON in my Spark job. I'm using Spark 1.1.0, json4s, and the Cassandra Spark Connector. The exception thrown is:

java.io.NotSerializableException: org.json4s.DefaultFormats

Examining the DefaultFormats companion object, and with this Stack Overflow question (http://stackoverflow.com/questions/24786377/notserializableexception-with-json4s-on-spark), it is clear that DefaultFormats cannot be serialized. The question is now what to do.

I can see this ticket has apparently addressed this issue in the Spark code base, by adding the keyword transient, yet I am not sure exactly how or where to apply it to my case. Is the solution to only instantiate the DefaultFormats class on the executors, to avoid serialization altogether? Is there another JSON parsing library for Scala/Spark that people are using? I initially tried using Jackson by itself, but ran into some errors with annotations that I could not resolve easily, and json4s worked out of the box. Here is my code:

import org.json4s._
import org.json4s.jackson.JsonMethods._

// this implicit is captured by the closure passed to rdd.map below
implicit val formats = DefaultFormats

val count = rdd.map(r => checkUa(r._2, r._1)).reduce((x, y) => x + y)
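The failure can be reproduced outside Spark: Spark ships each task closure through Java serialization, so everything the closure captures must be Serializable. A minimal sketch of that mechanism, assuming a hypothetical non-serializable `Formats` class as a stand-in for `DefaultFormats` (none of these names are json4s or Spark API):

```scala
import java.io.{ByteArrayOutputStream, NotSerializableException, ObjectOutputStream}

// Stand-in for org.json4s.DefaultFormats: a class that is NOT Serializable.
class Formats

// Case classes are Serializable by default, like Spark task closures.
case class BadTask(formats: Formats) { def run(s: String): Int = s.length }
case class GoodTask()                { def run(s: String): Int = s.length }

def serializable(obj: AnyRef): Boolean =
  try {
    new ObjectOutputStream(new ByteArrayOutputStream()).writeObject(obj)
    true
  } catch {
    case _: NotSerializableException => false
  }

// Capturing the Formats instance drags it into the serialized task graph and
// fails, just as `implicit val formats = DefaultFormats` is dragged into the
// closure passed to rdd.map.
println(serializable(BadTask(new Formats)))  // false
println(serializable(GoodTask()))            // true
```

This is why moving the declaration into the function (as the accepted answer below does) fixes the error: nothing non-serializable is captured from the enclosing scope.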

I do my JSON parsing in the checkUa function. I tried making count lazy, in hopes that it would delay execution somehow, but it had no effect. Perhaps moving the implicit val inside checkUa? Any advice much appreciated.

Recommended answer

This was already answered in an open ticket with json4s. The workaround is to put the implicit declaration inside the function:

val count = rdd
               .map(r => {implicit val formats = DefaultFormats; checkUa(r._2, r._1)})
               .reduce((x, y) => x + y) 
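The `transient` route from the question also works: a `@transient lazy val` field is excluded from the serialized closure and re-created lazily on each executor after deserialization. A rough sketch under stated assumptions (the `Parser`/`checkUa` body below is hypothetical; the real `checkUa` would call json4s's `parse`/`extract` using the formats):

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

class Parser extends Serializable {
  // Stand-in for `implicit val formats = DefaultFormats`: excluded from
  // serialization by @transient, rebuilt on first access in each JVM.
  @transient private lazy val formats: AnyRef = new Object

  def checkUa(json: String): Int = {
    val f = formats  // forces (re-)initialization, as json4s parsing would
    if (json.contains("\"ua\"")) 1 else 0
  }
}

// Round-trip through Java serialization, as Spark does when shipping tasks.
def roundTrip(p: Parser): Parser = {
  val buf = new ByteArrayOutputStream()
  val out = new ObjectOutputStream(buf)
  out.writeObject(p); out.close()
  new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
    .readObject().asInstanceOf[Parser]
}

val revived = roundTrip(new Parser)
println(revived.checkUa("""{"ua": "Mozilla"}"""))  // 1: the lazy val was rebuilt
```

Compared with declaring the implicit inside the mapped function, this avoids re-binding the value on every record at the cost of slightly more ceremony; both keep DefaultFormats out of the serialized task.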

