Scala/Spark App with "No TypeTag available" Error in "def main" style App


Problem description


I'm new to the Scala/Spark stack and I'm trying to figure out how to test my basic skills using SparkSql to "map" RDDs into TempTables and vice versa.


I have two distinct .scala files with the same code: a simple object (with def main...) and an object extending App.


In the simple-object version I get a "No TypeTag available" error connected to my case class Log:

object counter {
  def main(args: Array[String]) {
    // ... (SparkContext `sc` and the `triple` RDD are set up here)
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD
    case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)
    val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
    log.registerTempTable("logs")
    val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
    logSessioni.foreach(println)
  }
}


The error at line: log.registerTempTable("logs") says "No TypeTag available for Log".


In the other file (object extending App) everything works fine:

object counterApp extends App {
  // ... (SparkContext `sc` and the `triple` RDD are set up here)
  val sqlContext = new org.apache.spark.sql.SQLContext(sc)
  import sqlContext.createSchemaRDD
  case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)
  val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
  log.registerTempTable("logs")
  val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
  logSessioni.foreach(println)
}


Since I've just started, I'm not getting two main points: 1) Why does the same code work fine in the second file (object extending App), while in the first one (simple object) I get the error?


2) (and most importantly) What should I do in my code (the simple object file) to fix this error, so I can deal with case classes and TypeTags (which I barely know)?


Any answers and code examples will be much appreciated!

Thanks in advance

FF

Answer

TL;DR;


Just move your case class out of the method definition.


The problem is that your case class Log is defined inside the method in which it is being used. So, simply move your case class definition outside of the method and it will work. I would have to take a look at how this compiles down, but my guess is that this is more of a chicken-and-egg problem: the TypeTag (used for reflection) cannot be implicitly materialized, because the class has not been fully defined at that point. There are two SO questions with the same problem, which suggest that Spark would need to use a WeakTypeTag, and there is a JIRA explaining this more officially.
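A minimal sketch of the fix for the question's first file, reusing the names from the question (`sc` and the `triple` RDD are assumed to be set up as in the elided part of the original code; this targets the same old SQLContext/createSchemaRDD API the question uses):

```scala
// Define the case class at the top level, outside any method,
// so the compiler can materialize a TypeTag for it.
case class Log(visitatore: String, data: java.util.Date, pagina: String, count: Int)

object counter {
  def main(args: Array[String]) {
    // ... same setup as before: `sc` and the `triple` RDD are assumed to exist
    val sqlContext = new org.apache.spark.sql.SQLContext(sc)
    import sqlContext.createSchemaRDD

    // Log is now a stable, fully defined top-level type, so the implicit
    // schema derivation (which needs a TypeTag) can succeed here.
    val log = triple.map(p => Log(p._1, p._2, p._3, p._4))
    log.registerTempTable("logs")
    val logSessioni = sqlContext.sql("SELECT visitatore, data, pagina, count FROM logs")
    logSessioni.foreach(println)
  }
}
```

Moving Log to the companion/top level (or into a separate object) is enough; no change to the App-style file is needed, since that variant already compiles.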
