NullPointerException in Scala Spark, appears to be caused by the collection type?

Question

sessionIdList is of type:

scala> sessionIdList
res19: org.apache.spark.rdd.RDD[String] = MappedRDD[17] at distinct at <console>:30

When I try to run the code below:

val x = sc.parallelize(List(1,2,3))
val cartesianComp = x.cartesian(x).map(x => (x))

val kDistanceNeighbourhood = sessionIdList.map(s =>
  {
    cartesianComp.filter(v => v != null)
  }
)
kDistanceNeighbourhood.take(1)

I receive the exception:

14/05/21 16:20:46 ERROR Executor: Exception in task ID 80
java.lang.NullPointerException
        at org.apache.spark.rdd.RDD.filter(RDD.scala:261)
        at $line94.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:38)
        at $line94.$read$$iwC$$iwC$$iwC$$iwC$$anonfun$1.apply(<console>:36)
        at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        at scala.collection.Iterator$$anon$10.next(Iterator.scala:312)
        at scala.collection.Iterator$class.foreach(Iterator.scala:727)

However, if I use:

val l = sc.parallelize(List("1","2"))
val kDistanceNeighbourhood = l.map(s =>
  {
    cartesianComp.filter(v => v != null)
  }
)
kDistanceNeighbourhood.take(1)

then no exception is displayed.

The difference between the two code snippets is that in the first snippet sessionIdList is of type:

res19: org.apache.spark.rdd.RDD[String] = MappedRDD[17] at distinct at <console>:30

and in the second snippet l is of type:

scala> l
res13: org.apache.spark.rdd.RDD[String] = ParallelCollectionRDD[32] at parallelize at <console>:12

Why is this error occurring?

Do I need to convert sessionIdList to a ParallelCollectionRDD in order to fix this?

Recommended answer

Spark doesn't support nesting of RDDs (see http://stackoverflow.com/a/14130534/590203 for another occurrence of the same problem), so you can't perform transformations or actions on RDDs inside of other RDD operations.

In the first case, you're seeing a NullPointerException thrown by the worker when it tries to access a SparkContext object that's only present on the driver and not the workers.

In the second case, my hunch is the job was run locally on the driver and worked purely by accident.
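
One way to restructure the question's code so that no RDD is referenced inside another RDD's closure might look like the sketch below. It is only an illustration, not necessarily what the asker needs: `sessionIds` is a made-up stand-in for `sessionIdList`, and the names `pairs`, `viaBroadcast` and `viaCartesian` are hypothetical. It assumes the spark-shell, where `sc` is already defined, and (for the first option) that `cartesianComp` is small enough to collect on the driver.

```scala
// Sketch for the spark-shell, where `sc` is the predefined SparkContext.
// `sessionIds` is a stand-in for the question's sessionIdList (assumed data).
val sessionIds = sc.parallelize(List("a", "b", "a")).distinct()

val x = sc.parallelize(List(1, 2, 3))
val cartesianComp = x.cartesian(x)

// Option 1: collect the small RDD on the driver and ship it to the workers
// as a broadcast variable, so the closure only touches a local Array.
val pairs = sc.broadcast(cartesianComp.collect())
val viaBroadcast = sessionIds.map { s =>
  (s, pairs.value.filter(v => v != null))
}

// Option 2: stay distributed by combining the two RDDs from the driver
// with cartesian, then filtering the resulting (sessionId, pair) tuples.
val viaCartesian = sessionIds.cartesian(cartesianComp)
  .filter { case (_, v) => v != null }

viaBroadcast.take(1)
viaCartesian.take(1)
```

In both variants every RDD transformation is invoked on the driver, and the functions passed to map and filter capture only plain values, so the workers never need a SparkContext.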
