How does one implement a Hadoop Mapper in Scala 2.9.0?

Problem description

When I migrated to Scala 2.9.0 from 2.8.1, all of the code was functional except for the Hadoop mappers. Because I had some wrapper objects in the way, I distilled down to the following example:


import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.io.{LongWritable, Text}
import org.apache.hadoop.mapreduce.{Mapper, Job}

object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])

  }
}

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {

  }
}

When I run this in 2.8.1, it runs quite well (and I have plenty of production code in 2.8.1). In 2.9.0 I get the following compilation error:

error: type mismatch;
found   : java.lang.Class[MyMapper](classOf[MyMapper])
required: java.lang.Class[_ <: org.apache.hadoop.mapreduce.Mapper]
job.setMapperClass(classOf[MyMapper])

The failing call is when I call setMapperClass on the Job object. Here's the definition of that method:

public void setMapperClass(java.lang.Class<? extends org.apache.hadoop.mapreduce.Mapper> cls) throws java.lang.IllegalStateException { /* compiled code */ }

The definition of the Mapper class itself is this:

public class Mapper<KEYIN, VALUEIN, KEYOUT, VALUEOUT>

Does anyone have a sense of what I'm doing wrong? It looks to me like the type is fundamentally correct: MyMapper does extend Mapper, and the method wants something that extends Mapper. And it works great in 2.8.1...

Recommended answer

Silly as it seems, you can work around the problem by defining the Mapper before the Job. The following compiles:

import org.apache.hadoop._
import org.apache.hadoop.io._
import org.apache.hadoop.conf._
import org.apache.hadoop.mapreduce._

class MyMapper extends Mapper[LongWritable,Text,Text,Text] {
  override def map(key: LongWritable, value: Text, context: Mapper[LongWritable,Text,Text,Text]#Context) {
  }
}

object MyJob {
  def main(args:Array[String]) {
    val job = new Job(new Configuration())
    job.setMapperClass(classOf[MyMapper])
  }
}
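
For intuition about why the original call looks type-correct, here is a minimal sketch that runs without Hadoop. FakeMapper, MyFakeMapper, RawTypeSketch and acceptsMapperClass are hypothetical stand-ins, not Hadoop APIs, and the parameter type is an assumption about how Scala models the raw Java type in Class<? extends Mapper> (one wildcard per type parameter). It shows only that a Class[MyFakeMapper] conforms to such a parameter; it does not reproduce the 2.9.0 behaviour.

```scala
// Hypothetical stand-ins for Hadoop's Mapper and a user-defined subclass,
// so this sketch compiles without Hadoop on the classpath.
class FakeMapper[KEYIN, VALUEIN, KEYOUT, VALUEOUT]
class MyFakeMapper extends FakeMapper[Long, String, String, String]

object RawTypeSketch {
  // Assumed Scala view of Java's `Class<? extends Mapper>`: the raw type
  // becomes an existential with one wildcard per type parameter.
  def acceptsMapperClass(cls: Class[_ <: FakeMapper[_, _, _, _]]): Boolean =
    classOf[FakeMapper[_, _, _, _]].isAssignableFrom(cls)

  def main(args: Array[String]): Unit = {
    // Class[MyFakeMapper] conforms to the wildcard-bounded parameter type.
    println(acceptsMapperClass(classOf[MyFakeMapper]))
  }
}
```

If a compiler rejects the equivalent call against the real Job.setMapperClass, as 2.9.0 does in the question, that points at how the compiler resolved the Java signature rather than at the class hierarchy, which fits with a reordering of definitions being enough to work around it.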
