Error during benchmarking Sort in Hadoop2 - Partitions do not match


Problem description

I am trying to benchmark the Hadoop2 MapReduce framework. I am not using TeraSort, but testmapredsort.

step-1 Create random data:

hadoop jar hadoop/ randomwriter -Dtest.randomwrite.bytes_per_map=100 -Dtest.randomwriter.maps_per_host=10 /data/unsorted-data

step-2 sort the random data created in step-1:

hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort /data/unsorted-data /data/sorted-data

step-3 check if the sorting by MR works:

hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar testmapredsort -sortInput /data/unsorted-data -sortOutput /data/sorted-data

I get the following error during step-3. I want to know how to fix this error.

java.lang.Exception: java.io.IOException: Partitions do not match for record# 0 ! - '0' v/s '5'
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.io.IOException: Partitions do not match for record# 0 ! - '0' v/s '5'
    at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker$Map.map(SortValidator.java:266)
    at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker$Map.map(SortValidator.java:191)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
    at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
    at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
    at java.util.concurrent.FutureTask.run(FutureTask.java:138)
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
    at java.lang.Thread.run(Thread.java:695)
14/08/18 11:07:39 INFO mapreduce.Job: Job job_local2061890210_0001 failed with state FAILED due to: NA
14/08/18 11:07:39 INFO mapreduce.Job: Counters: 23
    File System Counters
        FILE: Number of bytes read=1436271
        FILE: Number of bytes written=1645526
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=1077294840
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=13
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=1
    Map-Reduce Framework
        Map input records=102247
        Map output records=102247
        Map output bytes=1328251
        Map output materialized bytes=26
        Input split bytes=102
        Combine input records=102247
        Combine output records=1
        Spilled Records=1
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=22
        Total committed heap usage (bytes)=198766592
    File Input Format Counters 
        Bytes Read=1077294840
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
    at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker.checkRecords(SortValidator.java:367)
    at org.apache.hadoop.mapred.SortValidator.run(SortValidator.java:579)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
    at org.apache.hadoop.mapred.SortValidator.main(SortValidator.java:594)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
    at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
    at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:115)
    at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:123)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

EDIT:

hadoop fs -ls /data/unsorted-data
-rw-r--r--   3 david supergroup          0 2014-08-14 12:45 /data/unsorted-data/_SUCCESS
-rw-r--r--   3 david supergroup 1077294840 2014-08-14 12:45 /data/unsorted-data/part-m-00000

hadoop fs -ls /data/sorted-data
-rw-r--r--   3 david supergroup          0 2014-08-14 12:55 /data/sorted-data/_SUCCESS
-rw-r--r--   3 david supergroup  137763270 2014-08-14 12:55 /data/sorted-data/part-m-00000
-rw-r--r--   3 david supergroup  134220478 2014-08-14 12:55 /data/sorted-data/part-m-00001
-rw-r--r--   3 david supergroup  134219656 2014-08-14 12:55 /data/sorted-data/part-m-00002
-rw-r--r--   3 david supergroup  134218029 2014-08-14 12:55 /data/sorted-data/part-m-00003
-rw-r--r--   3 david supergroup  134219244 2014-08-14 12:55 /data/sorted-data/part-m-00004
-rw-r--r--   3 david supergroup  134220252 2014-08-14 12:55 /data/sorted-data/part-m-00005
-rw-r--r--   3 david supergroup  134224231 2014-08-14 12:55 /data/sorted-data/part-m-00006
-rw-r--r--   3 david supergroup  134210232 2014-08-14 12:55 /data/sorted-data/part-m-00007

Solution

One side issue first: the randomwriter keys changed from test.randomwrite.bytes_per_map and test.randomwriter.maps_per_host to mapreduce.randomwriter.bytespermap and mapreduce.randomwriter.mapsperhost, so the settings you passed never get through to randomwriter. The core of the problem, however, is indicated by the filenames you listed under /data/sorted-data: your "sorted" data consists of map outputs (the part-m-* files), whereas correctly sorted output only comes from reduce outputs. Essentially, your sort command is performing only the map portion of the sort and never performing the merge in a subsequent reduce stage. Because of this, your testmapredsort command is correctly reporting that the sort did not work.
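The filename observation can be turned into a quick check. The following is a minimal sketch, assuming the usual part-m-NNNNN / part-r-NNNNN naming used by the new MapReduce API for map-only and reduce outputs; the class PartFileKind and its method kindOf are hypothetical helpers for illustration, not part of Hadoop.

```java
// Hypothetical helper, not part of Hadoop: classifies an output file by its name.
final class PartFileKind {

    // Returns "map" for part-m-* files, "reduce" for part-r-* files, "other" otherwise.
    // Assumes the part-m-/part-r- naming convention of the new MapReduce API.
    static String kindOf(String path) {
        // Keep only the last path component, e.g. "part-m-00000".
        String name = path.substring(path.lastIndexOf('/') + 1);
        if (name.startsWith("part-m-")) {
            return "map";
        }
        if (name.startsWith("part-r-")) {
            return "reduce";
        }
        return "other";
    }

    public static void main(String[] args) {
        // Paths taken from the listing in the question.
        String[] listed = {
            "/data/sorted-data/_SUCCESS",
            "/data/sorted-data/part-m-00000",
            "/data/sorted-data/part-m-00007"
        };
        for (String path : listed) {
            System.out.println(path + " -> " + kindOf(path));
        }
    }
}
```

Every data file in the question's /data/sorted-data listing is classified as "map" by this check, which is consistent with a job that ran with zero reduces.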

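Checking the code of Sort.java (http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop/hadoop-mapreduce-examples/2.2.0-cdh5.0.0-beta-1/org/apache/hadoop/examples/Sort.java#85), you can see that there is in fact no protection against num_reduces somehow getting set to 0. The typical behavior of Hadoop MR is that setting the number of reduces to 0 indicates a "map only" job, where the map outputs go directly to HDFS rather than being intermediate outputs passed to reduce tasks. Here are the relevant lines: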

    int num_reduces = (int) (cluster.getMaxReduceTasks() * 0.9);
    String sort_reduces = conf.get(REDUCES_PER_HOST);
    if (sort_reduces != null) {
        num_reduces = cluster.getTaskTrackers() *
                        Integer.parseInt(sort_reduces);
    }

Now, in a normal setup, all of that logic using "default" settings should provide a nonzero number of reduces, such that the sort works. I was able to reproduce your problem by running:

hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort -r 0 /data/unsorted-data /data/sorted-data

The -r 0 option forces 0 reduces. In your case, it is more likely that cluster.getMaxReduceTasks() is returning 1, which the (int) cast of 1 * 0.9 truncates to 0 (or the method may even be returning 0 if your cluster is broken). I don't know off the top of my head all the ways that method could return 1; it appears that simply setting mapreduce.tasktracker.reduce.tasks.maximum to 1 doesn't apply to that method. Other factors that go into task capacity include the number of cores and the amount of memory available.
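To see why a return value of 1 ends up as a map-only job, the arithmetic of the quoted lines can be replayed in isolation. This is only a sketch: ReduceCountSketch and computeNumReduces are hypothetical names, and the plain int parameters stand in for cluster.getMaxReduceTasks(), cluster.getTaskTrackers() and the parsed REDUCES_PER_HOST value. It does not call any Hadoop API.

```java
// Hypothetical stand-alone sketch; it only replays the arithmetic quoted from Sort.java.
final class ReduceCountSketch {

    // maxReduceTasks stands in for cluster.getMaxReduceTasks(),
    // taskTrackers for cluster.getTaskTrackers(),
    // reducesPerHost for the parsed REDUCES_PER_HOST setting (null when unset).
    static int computeNumReduces(int maxReduceTasks, int taskTrackers, Integer reducesPerHost) {
        // The (int) cast truncates toward zero, so 1 * 0.9 = 0.9 becomes 0.
        int numReduces = (int) (maxReduceTasks * 0.9);
        if (reducesPerHost != null) {
            numReduces = taskTrackers * reducesPerHost;
        }
        return numReduces;
    }

    public static void main(String[] args) {
        System.out.println(computeNumReduces(1, 1, null)); // prints 0: a map-only job
        System.out.println(computeNumReduces(2, 1, null)); // prints 1: one reduce task
    }
}
```

Under these assumptions, any reported reduce capacity below 2 yields zero reduces unless the count is overridden, which is why passing an explicit reduce count sidesteps the problem.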

Assuming your cluster is at least capable of 1 reduce task per TaskTracker, you can retry your sort step using -r 1:

hadoop fs -rmr /data/sorted-data
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort -r 1 /data/unsorted-data /data/sorted-data
