value reduceByKey is not a member of org.apache.spark.rdd.RDD
Problem description
It's very sad. My Spark version is 2.1.1 and my Scala version is 2.11.
import org.apache.spark.SparkContext._
import com.mufu.wcsa.component.dimension.{DimensionKey, KeyTrait}
import com.mufu.wcsa.log.LogRecord
import org.apache.spark.rdd.RDD

object PV {
  def stat[C <: LogRecord, K <: DimensionKey](statTrait: KeyTrait[C, K], logRecords: RDD[C]): RDD[(K, Int)] = {
    val t = logRecords.map(record => (statTrait.getKey(record), 1)).reduceByKey((x, y) => x + y)
    t
  }
}
I got this error at 1502387780429:
[ERROR] /Users/lemanli/work/project/newcma/wcsa/wcsa_my/wcsavistor/src/main/scala/com/mufu/wcsa/component/stat/PV.scala:25: error: value reduceByKey is not a member of org.apache.spark.rdd.RDD[(K, Int)]
[ERROR] val t = logRecords.map(record =>(statTrait.getKey(record),1)).reduceByKey((x,y) => x + y)
There is a trait defined:

trait KeyTrait[C <: LogRecord, K <: DimensionKey] {
  def getKey(c: C): K
}
Update: it compiles now, thanks.
def stat[C <: LogRecord, K <: DimensionKey: ClassTag: Ordering](statTrait: KeyTrait[C, K], logRecords: RDD[C]): RDD[(K, Int)] = {
  val t = logRecords.map(record => (statTrait.getKey(record), 1)).reduceByKey((x, y) => x + y)
The key type needs to provide an Ordering[T]:
object ClientStat extends KeyTrait[DetailLogRecord, ClientStat] {
  implicit val clientStatSorting = new Ordering[ClientStat] {
    override def compare(x: ClientStat, y: ClientStat): Int = x.key.compare(y.key)
  }

  def getKey(detailLogRecord: DetailLogRecord): ClientStat = new ClientStat(detailLogRecord)
}
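The fix above works because the compiler can now find an Ordering (and ClassTag) for the key type during implicit search. Here is a Spark-free sketch of the same mechanics, with a hypothetical LogKey key type and a local stand-in for reduceByKey (none of these names are from the question):

```scala
import scala.reflect.ClassTag

// Hypothetical key type standing in for ClientStat.
case class LogKey(key: String)

object LogKey {
  // Companion-object implicits are found automatically during implicit
  // search, so callers need no extra imports -- the same role
  // clientStatSorting plays above.
  implicit val ordering: Ordering[LogKey] = Ordering.by((k: LogKey) => k.key)
}

// Local stand-in for Spark's reduceByKey: like PairRDDFunctions, it
// demands ClassTag and Ordering context bounds on the key type.
def reduceByKeyLocal[K: ClassTag: Ordering, V](pairs: Seq[(K, V)])(f: (V, V) => V): Seq[(K, V)] =
  pairs
    .groupBy(_._1)
    .map { case (k, vs) => k -> vs.map(_._2).reduce(f) }
    .toSeq
    .sortBy(_._1) // uses the Ordering[K] context bound

val counts = reduceByKeyLocal(Seq((LogKey("a"), 1), (LogKey("b"), 1), (LogKey("a"), 1)))(_ + _)
```

Because the Ordering lives in the key's companion object, it is always in implicit scope wherever the key type is used, which is why the updated stat signature compiles without extra imports.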
This comes from using a pair RDD function generically. The reduceByKey method is actually a method of the PairRDDFunctions class, to which there is an implicit conversion from RDD:
implicit def rddToPairRDDFunctions[K, V](rdd: RDD[(K, V)])
(implicit kt: ClassTag[K], vt: ClassTag[V], ord: Ordering[K] = null): PairRDDFunctions[K, V]
So it requires several implicit typeclasses. Normally when working with simple concrete types, those are already in scope. But you should be able to amend your method to also require those same implicits:
def stat[C <: LogRecord, K <: DimensionKey](statTrait: KeyTrait[C, K], logRecords: RDD[C])(implicit kt: ClassTag[K], ord: Ordering[K])
Or using the newer syntax:
def stat[C <: LogRecord, K <: DimensionKey: ClassTag: Ordering](statTrait: KeyTrait[C, K], logRecords: RDD[C])