Why does scala.beans.BeanProperty work differently in Spark?


Question

In a Scala REPL the following code

import scala.beans.BeanProperty

class EmailAccount {
  @scala.beans.BeanProperty var accountName: String = null

  override def toString: String = {
    return s"acct ($accountName)"
  }
}
classOf[EmailAccount].getDeclaredConstructor()

results in

res0: java.lang.reflect.Constructor[EmailAccount] = public EmailAccount()

However, in Spark's REPL I get

java.lang.NoSuchMethodException: EmailAccount.<init>()
  at java.lang.Class.getConstructor0(Class.java:2810)
  at java.lang.Class.getDeclaredConstructor(Class.java:2053)
  ... 48 elided

What causes this discrepancy? How can I get Spark to match the behavior of the Scala REPL?

I launched the REPLs like so:

/home/placey/Downloads/spark-2.0.0-bin-hadoop2.7/bin/spark-shell --master local --jars /home/placey/snakeyaml-1.17.jar

scala -classpath "/home/placey/snakeyaml-1.17.jar"

The Scala versions are as follows. Spark:

Using Scala version 2.11.8 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_55)

scala:

Welcome to Scala version 2.11.6 (Java HotSpot(TM) 64-Bit Server VM, Java 1.7.0_55).

Answer

Actually, this isn't specific to scala.beans.BeanProperty or even to Spark. You can get the same behaviour in the standard Scala REPL by running it with the -Yrepl-class-based parameter:

scala -Yrepl-class-based

Now, let's try defining a simple empty class:

scala> class Foo()
defined class Foo

scala> classOf[Foo].getConstructors
res0: Array[java.lang.reflect.Constructor[_]] = Array(public Foo($iw))

scala> classOf[Foo].getFields
res1: Array[java.lang.reflect.Field] = Array(public final $iw Foo.$outer)

As you can see, the REPL modified your class on the fly, adding an extra field and an extra constructor parameter. Why?

Whenever you create a val or var in the Scala REPL, it gets wrapped in a special object, because there's no such thing as a "global variable" in Scala. See this answer.
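As a rough sketch of the difference (the wrapper names ObjectWrapper and ClassWrapper below are made up for illustration; the real REPL generates synthetic, nested names such as $iw):

// Default REPL: definitions live inside a singleton object.
object ObjectWrapper {
  val x = 42    // a REPL "global" actually lives here
  class Foo     // nested in an object: new ObjectWrapper.Foo() needs no outer value
}

// -Yrepl-class-based: definitions live inside a class instead.
class ClassWrapper extends Serializable {
  val x = 42
  class Foo     // inner class: its constructor takes the enclosing
                // ClassWrapper instance as the hidden $outer parameter
}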

Normally this wrapper is an object, so its members are available globally. However, with -Yrepl-class-based the REPL uses class instances instead of a single global object. This feature was introduced by the Spark developers because Spark needs classes to be serializable so they can be sent to a remote worker (see this pull request).

Because of this, any class you define in the REPL needs to receive the $iw instance; otherwise you wouldn't be able to access the global vals and vars you defined in the REPL. Additionally, the generated class automatically extends Serializable.
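You can check both claims in the same scala -Yrepl-class-based session with a quick reflection probe (reusing the Foo defined above):

// The only available constructor takes the synthetic wrapper instance:
classOf[Foo].getConstructors.head.getParameterTypes.foreach(println)

// Should report true if the REPL added Serializable as described above:
println(classOf[Serializable].isAssignableFrom(classOf[Foo]))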

I'm afraid you can't do anything to prevent this. spark-shell enables -Yrepl-class-based by default. Even if there were an option to disable this behaviour, you would run into many other problems, because your classes would no longer be serializable, yet Spark needs to serialize them.
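Note that the class itself remains perfectly usable inside spark-shell; what fails is only the reflective lookup of a no-arg constructor, which is what bean-oriented libraries (such as the SnakeYAML jar loaded above) typically perform. A small illustration using the original EmailAccount, with a made-up account name:

// Direct use works fine; @BeanProperty generated the getter/setter pair:
val acct = new EmailAccount
acct.setAccountName("alice@example.com")
println(acct.getAccountName)
println(acct)   // prints: acct (alice@example.com)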

