Error during benchmarking Sort in Hadoop2 - Partitions do not match
I am trying to benchmark the Hadoop2 MapReduce framework. It is NOT TeraSort, but testmapredsort.
step-1 Create random data:
hadoop jar hadoop/ randomwriter -Dtest.randomwrite.bytes_per_map=100 -Dtest.randomwriter.maps_per_host=10 /data/unsorted-data
step-2 sort the random data created in step-1:
hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort /data/unsorted-data /data/sorted-data
step-3 check if the sorting by MR works:
hadoop jar hadoop/share/hadoop/mapreduce/hadoop-mapreduce-client-jobclient-2.2.0-tests.jar testmapredsort -sortInput /data/unsorted-data -sortOutput /data/sorted-data
I get the following error during step-3. I want to know how to fix this error.
java.lang.Exception: java.io.IOException: Partitions do not match for record# 0 ! - '0' v/s '5'
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:403)
Caused by: java.io.IOException: Partitions do not match for record# 0 ! - '0' v/s '5'
at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker$Map.map(SortValidator.java:266)
at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker$Map.map(SortValidator.java:191)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:429)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:235)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
at java.lang.Thread.run(Thread.java:695)
14/08/18 11:07:39 INFO mapreduce.Job: Job job_local2061890210_0001 failed with state FAILED due to: NA
14/08/18 11:07:39 INFO mapreduce.Job: Counters: 23
File System Counters
FILE: Number of bytes read=1436271
FILE: Number of bytes written=1645526
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=1077294840
HDFS: Number of bytes written=0
HDFS: Number of read operations=13
HDFS: Number of large read operations=0
HDFS: Number of write operations=1
Map-Reduce Framework
Map input records=102247
Map output records=102247
Map output bytes=1328251
Map output materialized bytes=26
Input split bytes=102
Combine input records=102247
Combine output records=1
Spilled Records=1
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=22
Total committed heap usage (bytes)=198766592
File Input Format Counters
Bytes Read=1077294840
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
at org.apache.hadoop.mapred.SortValidator$RecordStatsChecker.checkRecords(SortValidator.java:367)
at org.apache.hadoop.mapred.SortValidator.run(SortValidator.java:579)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.hadoop.mapred.SortValidator.main(SortValidator.java:594)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
at org.apache.hadoop.test.MapredTestDriver.run(MapredTestDriver.java:115)
at org.apache.hadoop.test.MapredTestDriver.main(MapredTestDriver.java:123)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
EDIT:
hadoop fs -ls /data/unsorted-data
-rw-r--r-- 3 david supergroup 0 2014-08-14 12:45 /data/unsorted-data/_SUCCESS
-rw-r--r-- 3 david supergroup 1077294840 2014-08-14 12:45 /data/unsorted-data/part-m-00000
hadoop fs -ls /data/sorted-data
-rw-r--r-- 3 david supergroup 0 2014-08-14 12:55 /data/sorted-data/_SUCCESS
-rw-r--r-- 3 david supergroup 137763270 2014-08-14 12:55 /data/sorted-data/part-m-00000
-rw-r--r-- 3 david supergroup 134220478 2014-08-14 12:55 /data/sorted-data/part-m-00001
-rw-r--r-- 3 david supergroup 134219656 2014-08-14 12:55 /data/sorted-data/part-m-00002
-rw-r--r-- 3 david supergroup 134218029 2014-08-14 12:55 /data/sorted-data/part-m-00003
-rw-r--r-- 3 david supergroup 134219244 2014-08-14 12:55 /data/sorted-data/part-m-00004
-rw-r--r-- 3 david supergroup 134220252 2014-08-14 12:55 /data/sorted-data/part-m-00005
-rw-r--r-- 3 david supergroup 134224231 2014-08-14 12:55 /data/sorted-data/part-m-00006
-rw-r--r-- 3 david supergroup 134210232 2014-08-14 12:55 /data/sorted-data/part-m-00007
Aside from the change in keys from test.randomwrite.bytes_per_map and test.randomwriter.maps_per_host to mapreduce.randomwriter.bytespermap and mapreduce.randomwriter.mapsperhost causing the settings to not get through to randomwriter, the core of the problem, as indicated by the filenames you listed under /data/sorted-data, is that your sorted data consists of map outputs, whereas correctly sorted output only comes from reduce outputs; essentially, your sort command is only performing the map portion of the sort, and never performing the merge in a subsequent reduce stage. Because of this, your testmapredsort command is correctly reporting that the sort did not work.
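As a quick rule of thumb, the output filename prefix tells you which phase wrote the file: part-m-* files come straight from map tasks, while part-r-* files come from reduce tasks. A minimal standalone sketch of that check (the class and method names here are hypothetical, not part of Hadoop):

```java
public class OutputKind {
    // Classify a Hadoop output file by its name prefix.
    // part-m-* : written directly by a map task (map-only job, not globally sorted)
    // part-r-* : written by a reduce task (the merged, sorted output)
    static String kind(String fileName) {
        if (fileName.startsWith("part-r-")) return "reduce";
        if (fileName.startsWith("part-m-")) return "map";
        return "other";
    }

    public static void main(String[] args) {
        // The files listed under /data/sorted-data are all part-m-*:
        System.out.println(kind("part-m-00000")); // prints "map"
    }
}
```

After a successful sort with a nonzero reduce count, the files under /data/sorted-data would instead match part-r-*.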
Checking the code of Sort.java (http://grepcode.com/file/repository.cloudera.com/content/repositories/releases/org.apache.hadoop/hadoop-mapreduce-examples/2.2.0-cdh5.0.0-beta-1/org/apache/hadoop/examples/Sort.java#85), you can see that there is in fact no protection against num_reduces somehow getting set to 0; the typical behavior of Hadoop MR is that setting the number of reduces to 0 indicates a "map only" job, where the map outputs go directly to HDFS rather than being intermediate outputs passed to reduce tasks. Here are the relevant lines:
85   int num_reduces = (int) (cluster.getMaxReduceTasks() * 0.9);
86   String sort_reduces = conf.get(REDUCES_PER_HOST);
87   if (sort_reduces != null) {
88     num_reduces = cluster.getTaskTrackers() *
89                   Integer.parseInt(sort_reduces);
90   }
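Those lines can be mirrored as a standalone method to see how a small cluster ends up with zero reduces; this is a hypothetical re-creation for illustration, not code taken from Sort.java itself:

```java
public class NumReduces {
    // Mirror of the Sort.java logic above: default to 90% of the cluster's
    // reduce-task capacity, unless a reduces-per-host override is supplied.
    static int numReduces(int maxReduceTasks, Integer reducesPerHost, int taskTrackers) {
        int n = (int) (maxReduceTasks * 0.9);
        if (reducesPerHost != null) {
            n = taskTrackers * reducesPerHost; // the REDUCES_PER_HOST override path
        }
        return n;
    }

    public static void main(String[] args) {
        // A cluster whose max reduce capacity is 1: (int) (1 * 0.9) == 0,
        // so the sort silently becomes a map-only job.
        System.out.println(numReduces(1, null, 1)); // prints 0
    }
}
```

This is why a single-TaskTracker setup can end up with num_reduces == 0 even though nothing explicitly requested a map-only job.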
Now, in a normal setup, all of that logic using "default" settings should provide a nonzero number of reduces, such that the sort works. I was able to repro your problem by running:
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort -r 0 /data/unsorted-data /data/sorted-data
using -r 0 to force 0 reduces. In your case, it is more likely that cluster.getMaxReduceTasks() is returning 1 (or possibly even 0 if your cluster is broken). I don't know off the top of my head all the ways that method could return 1; it appears that simply setting mapreduce.tasktracker.reduce.tasks.maximum to 1 doesn't apply to that method. Other factors that go into task capacity include the number of cores and the amount of memory available.
Assuming your cluster is at least capable of 1 reduce task per TaskTracker, you can retry your sort step using -r 1:
hadoop fs -rmr /data/sorted-data
hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar sort -r 1 /data/unsorted-data /data/sorted-data