Type mismatch with identical types in Spark-shell


Problem description


I have built a scripting workflow around the spark-shell, but I'm often vexed by bizarre type mismatches (probably inherited from the scala repl) occurring with identical found and required types. The following example illustrates the problem. Executed in paste mode, there is no problem:

scala> :paste
// Entering paste mode (ctrl-D to finish)


import org.apache.spark.rdd.RDD
case class C(S:String)
def f(r:RDD[C]): String = "hello"
val in = sc.parallelize(List(C("hi")))
f(in)

// Exiting paste mode, now interpreting.

import org.apache.spark.rdd.RDD
defined class C
f: (r: org.apache.spark.rdd.RDD[C])String
in: org.apache.spark.rdd.RDD[C] = ParallelCollectionRDD[0] at parallelize at <console>:13
res0: String = hello

but

scala> f(in)
<console>:29: error: type mismatch;
 found   : org.apache.spark.rdd.RDD[C]
 required: org.apache.spark.rdd.RDD[C]
              f(in)
                ^ 

There are related discussions about the scala repl and about the spark-shell, but the issues mentioned there seem unrelated (and resolved) to me.

This problem makes it seriously difficult to write passable code to be executed interactively in the repl, and causes the loss of most of the advantage of working in a repl to begin with. Is there a solution? (And/or is it a known issue?)

Edits:

The problem occurred with spark 1.2 and 1.3.0. Tests were made on spark 1.3.0 using scala 2.10.4.

It seems that, at least in this test, repeating the statements that use the class, separately from the case class definition, mitigates the problem:

scala> :paste
// Entering paste mode (ctrl-D to finish)


def f(r:RDD[C]): String = "hello"
val in = sc.parallelize(List(C("hi1")))

// Exiting paste mode, now interpreting.

f: (r: org.apache.spark.rdd.RDD[C])String
in: org.apache.spark.rdd.RDD[C] = ParallelCollectionRDD[1] at parallelize at <console>:26

scala> f(in)
res2: String = hello

Solution

Unfortunately, this is still an open issue. Code entered in the spark-shell is wrapped in classes, which sometimes causes strange behavior like this.
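The effect of that class-wrapping can be illustrated outside the REPL. The sketch below is an assumption-laden analogy, not the actual spark-shell mechanism (the real shell uses internal `$iw`/`$read` wrappers; the `Wrapper1`/`Wrapper2` names here are hypothetical): two structurally identical case classes defined in different enclosing objects are distinct JVM types that share the same simple name, which is why an error can report identical-looking found and required types.

```scala
// Hypothetical stand-ins for the REPL's generated wrapper objects:
// re-entering a definition puts it in a fresh wrapper, creating a new type.
object Wrapper1 { case class C(s: String) }
object Wrapper2 { case class C(s: String) }

object ReplWrappingDemo {
  def main(args: Array[String]): Unit = {
    val c1 = Wrapper1.C("hi")
    val c2 = Wrapper2.C("hi")
    // Both values print as C(hi) and both classes are simply named "C",
    // yet they are different runtime classes and incompatible types.
    println(c1.getClass.getName)        // Wrapper1$C
    println(c2.getClass.getName)        // Wrapper2$C
    println(c1.getClass == c2.getClass) // false
  }
}
```

A compiler error comparing `Wrapper1.C` against `Wrapper2.C` would likewise print both sides as just `C`, mirroring the `found: RDD[C] / required: RDD[C]` message above.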

Another problem: errors like value reduceByKey is not a member of org.apache.spark.rdd.RDD[(...,...)] can be caused by using different versions of Spark in the same project. If you use IntelliJ, go to File -> Project Structure -> Libraries and delete entries like "SBT: org.apache.spark:spark-catalyst_2.10:1.1.0:jar". You need libraries matching Spark version 1.2.0 or 1.3.0.
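One way to avoid such mixed-version clashes at the source is to pin a single Spark version in the build definition rather than deleting jars by hand. A minimal build.sbt sketch, assuming an sbt build and the standard spark-core artifact (the version number just follows the advice above, it is not from the original post):

```scala
// build.sbt -- keep every Spark module on one shared version so that
// types from two different Spark jars never end up in the same project.
val sparkVersion = "1.3.0"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % sparkVersion % "provided"
)
```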

Hope this helps you somehow.

