value reduceByKey is not a member of org.apache.spark.rdd.RDD


Problem description


My Spark version is 2.1.1 and my Scala version is 2.11.

import org.apache.spark.SparkContext._
import com.mufu.wcsa.component.dimension.{DimensionKey, KeyTrait}
import com.mufu.wcsa.log.LogRecord
import org.apache.spark.rdd.RDD

object PV {

//
  def stat[C <: LogRecord, K <: DimensionKey](statTrait: KeyTrait[C, K], logRecords: RDD[C]): RDD[(K, Int)] = {
    val t = logRecords.map(record => (statTrait.getKey(record), 1)).reduceByKey((x, y) => x + y)

I got this error:

at 1502387780429
[ERROR] /Users/lemanli/work/project/newcma/wcsa/wcsa_my/wcsavistor/src/main/scala/com/mufu/wcsa/component/stat/PV.scala:25: error: value reduceByKey is not a member of org.apache.spark.rdd.RDD[(K, Int)]
[ERROR]     val t = logRecords.map(record =>(statTrait.getKey(record),1)).reduceByKey((x,y) => x + y)

There is a trait defined:

trait KeyTrait[C <: LogRecord, K <: DimensionKey] {
  def getKey(c: C): K
}

It compiles now, thanks.

 def stat[C <: LogRecord, K <: DimensionKey : ClassTag : Ordering](statTrait: KeyTrait[C, K], logRecords: RDD[C]): RDD[(K, Int)] = {
    val t = logRecords.map(record => (statTrait.getKey(record), 1)).reduceByKey((x, y) => x + y)

The key type needs to provide an Ordering[T]:

  object ClientStat extends KeyTrait[DetailLogRecord, ClientStat] {
    implicit val clientStatSorting = new Ordering[ClientStat] {
      override def compare(x: ClientStat, y: ClientStat): Int = x.key.compare(y.key)
    }

    def getKey(detailLogRecord: DetailLogRecord): ClientStat = new ClientStat(detailLogRecord)
  }
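Note that, assuming ClientStat is a class and this object is its companion, the implicit Ordering[ClientStat] defined here is found through the implicit scope of the type Ordering[ClientStat], so no extra import is needed at the call site.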

Solution

This comes from using a pair RDD function generically. The reduceByKey method is actually defined on the PairRDDFunctions class, and there is an implicit conversion to it from RDD:

implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
    (implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V]

So it requires several implicit typeclasses. Normally, when working with simple concrete types, those are already in scope.
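As an illustration, here is a minimal sketch (the names wordCount and genericCount are hypothetical) of the concrete case resolving where an unconstrained generic one does not:

import org.apache.spark.rdd.RDD

// Compiles: an implicit ClassTag[String] is always available, so the
// conversion to PairRDDFunctions applies and reduceByKey resolves.
def wordCount(counts: RDD[(String, Int)]): RDD[(String, Int)] =
  counts.reduceByKey(_ + _)

// Does not compile: nothing supplies a ClassTag[K], so the implicit
// conversion is not applicable and reduceByKey is not a member.
// def genericCount[K](pairs: RDD[(K, Int)]): RDD[(K, Int)] =
//   pairs.reduceByKey(_ + _)

But you should be able to amend your method to also require those same implicits: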

def stat[C <: LogRecord, K <: DimensionKey](statTrait: KeyTrait[C, K], logRecords: RDD[C])(implicit kt: ClassTag[K], ord: Ordering[K])

Or using the newer syntax:

def stat[C <: LogRecord, K <: DimensionKey : ClassTag : Ordering](statTrait: KeyTrait[C, K], logRecords: RDD[C])
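Putting it together, here is a self-contained sketch; the LogRecord, DimensionKey, and KeyTrait stubs below are hypothetical stand-ins for the real project classes:

import scala.reflect.ClassTag
import org.apache.spark.rdd.RDD

// Hypothetical stubs standing in for the project's own types.
trait LogRecord
trait DimensionKey
trait KeyTrait[C <: LogRecord, K <: DimensionKey] {
  def getKey(c: C): K
}

object PV {
  // The context bounds `K : ClassTag : Ordering` bring an implicit
  // ClassTag[K] and Ordering[K] into scope, which lets the compiler
  // apply rddToPairRDDFunctions, making reduceByKey available.
  def stat[C <: LogRecord, K <: DimensionKey : ClassTag : Ordering](
      statTrait: KeyTrait[C, K],
      logRecords: RDD[C]): RDD[(K, Int)] =
    logRecords
      .map(record => (statTrait.getKey(record), 1))
      .reduceByKey((x, y) => x + y)
}

Strictly speaking, the ClassTag bound is what the conversion requires (the Ordering parameter defaults to null in the signature above), but declaring the Ordering bound as well matches the accepted fix and enables key-ordered operations later.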
