Hadoop options are not having any effect (mapreduce.input.lineinputformat.linespermap, mapred.max.map.failures.percent)
I am trying to implement a MapReduce job in which each mapper takes 150 lines of the text file and all the mappers run simultaneously. The job should also not fail, no matter how many map tasks fail.
Here's the configuration part:
JobConf conf = new JobConf(Main.class);
conf.setJobName("My mapreduce");
conf.set("mapreduce.input.lineinputformat.linespermap", "150");
conf.set("mapred.max.map.failures.percent","100");
conf.setInputFormat(NLineInputFormat.class);
FileInputFormat.addInputPath(conf, new Path(args[0]));
FileOutputFormat.setOutputPath(conf, new Path(args[1]));
The problem is that Hadoop creates a mapper for every single line of text, the mappers seem to run sequentially, and if a single one fails, the whole job fails.
From this I deduce that the settings I've applied have no effect.
What did I do wrong?
If you want to quickly find the correct names of the options for Hadoop's new API, use this link: http://pydoop.sourceforge.net/docs/examples/intro.html#hadoop-0-21-0-notes .
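One possible explanation for the one-mapper-per-line symptom, offered as a guess and not a confirmed diagnosis: the configuration above uses the old API (JobConf), while the lines-per-map key is written in the new API's naming. If the input format only reads a different key, the value that was set is silently ignored and the default of one line per split applies. The sketch below imitates that lookup with java.util.Properties and no Hadoop classes. The old-style key name "mapred.line.input.format.linespermap" is my assumption about what older releases of the old-API NLineInputFormat read, and the class and method names are hypothetical.

```java
import java.util.Properties;

// Stand-in for a Hadoop job configuration; no Hadoop classes are used here.
final class KeyMismatchDemo {
    // ASSUMPTION: key read by the old-API NLineInputFormat in older releases.
    static final String OLD_KEY = "mapred.line.input.format.linespermap";
    // Key that the configuration in the question sets (new-API naming).
    static final String NEW_KEY = "mapreduce.input.lineinputformat.linespermap";

    // Mimics a getInt(key, default) lookup: an unset key yields the default.
    static int getInt(Properties conf, String key, int defaultValue) {
        String value = conf.getProperty(key);
        return value == null ? defaultValue : Integer.parseInt(value.trim());
    }

    // Hypothetical reader: looks only at OLD_KEY, defaulting to one line per split.
    static int linesPerSplit(Properties conf) {
        return getInt(conf, OLD_KEY, 1);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(NEW_KEY, "150");
        System.out.println(linesPerSplit(conf)); // prints 1: the key that was set is never read

        conf.setProperty(OLD_KEY, "150");
        System.out.println(linesPerSplit(conf)); // prints 150
    }
}
```

If this is what is happening, it would be worth checking which NLineInputFormat is imported (org.apache.hadoop.mapred.lib or org.apache.hadoop.mapreduce.lib.input) and using the key names that match the installed Hadoop version. For the failure tolerance, JobConf also offers setMaxMapTaskFailuresPercent(int), which avoids spelling the property name by hand.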