Hadoop and number of reducers in Eclipse


Problem description

In my MapReduce program, I have to use a Partitioner:

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.partition.HashPartitioner;

public class TweetPartitionner extends HashPartitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text a_key, IntWritable a_value, int a_nbPartitions) {
        // Hashtag keys go to partition 0, all other keys to partition 1
        if (a_key.toString().startsWith("#"))
            return 0;
        else
            return 1;
    }

}

And I have set the number of reduce tasks: job.setNumReduceTasks(2);

But I get the following error: java.io.IOException: Illegal partition for #rescinfo (1)

The parameter a_nbPartitions is 1.

I've read in another post, Hadoop: Number of reducer is not equal to what I have set in program, that:


Running it in eclipse seems to use the local job runner. It only supports 0 or 1 reducers. If you try to set it to use more than one reducer, it ignores it and just uses one anyway.

I develop on Hadoop 0.20.2 installed on Cygwin, and of course I use Eclipse. How can I do this?

Recommended answer

You actually don't need a dedicated Hadoop cluster for that. You just have to tell Eclipse that you intend to run this job on your pseudo-distributed cluster rather than locally within itself. To do that, add these lines to your code:

Configuration conf = new Configuration();
// Point the job at the pseudo-distributed cluster instead of the local job runner
conf.set("fs.default.name", "hdfs://localhost:9000");   // default file system: the HDFS NameNode
conf.set("mapred.job.tracker", "localhost:9001");       // JobTracker address

And after that, set the number of reducers to 2 via:

job.setNumReduceTasks(2);
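
For reference, a minimal driver tying these pieces together might look like the sketch below. The driver class name, the job name, and the input/output path handling are assumptions for illustration; only the two configuration properties, setNumReduceTasks(2), and the TweetPartitionner class come from this post, and the mapper/reducer lines are left as commented placeholders.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class TweetDriver {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Run on the pseudo-distributed cluster instead of the local job runner
        conf.set("fs.default.name", "hdfs://localhost:9000");
        conf.set("mapred.job.tracker", "localhost:9001");

        Job job = new Job(conf, "tweet count");              // Job(Configuration, String) in Hadoop 0.20.x
        job.setJarByClass(TweetDriver.class);
        // job.setMapperClass(...);                          // your mapper class here
        // job.setReducerClass(...);                         // your reducer class here
        job.setPartitionerClass(TweetPartitionner.class);
        job.setNumReduceTasks(2);                            // two reducers, one per partition

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}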

And yes, you have to be very sure about your partitioner logic. You can visit this page, which shows how to write a custom partitioner.
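
On that partitioner logic: the error above occurs because valid partition numbers run from 0 to a_nbPartitions - 1, and the local job runner passes a_nbPartitions = 1, so returning 1 for hashtag keys is out of range. A defensive variant is sketched below; the class name SafeTweetPartitionner is made up for illustration, and the modulo fallback (collapsing everything to partition 0 when only one reducer is available) is an assumption about how you might want to handle that case, not part of the original answer.

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class SafeTweetPartitionner extends Partitioner<Text, IntWritable> {

    @Override
    public int getPartition(Text a_key, IntWritable a_value, int a_nbPartitions) {
        // Intended split: hashtag keys to partition 0, everything else to partition 1
        int partition = a_key.toString().startsWith("#") ? 0 : 1;
        // Never return an index outside [0, a_nbPartitions); with a single
        // reducer (a_nbPartitions == 1) every key collapses to partition 0
        return partition % a_nbPartitions;
    }
}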

HTH

