hadoop-streaming example failed to run - Type mismatch in key from map
Problem description
I was running the following command:
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
-D stream.map.output.field.separator=. \
-D stream.num.map.output.key.fields=4 \
-input myInputDirs \
-output myOutputDir \
-mapper org.apache.hadoop.mapred.lib.IdentityMapper \
-reducer org.apache.hadoop.mapred.lib.IdentityReducer
What should the input file look like when IdentityMapper is the mapper?
I was hoping to see it sort on certain selected key fields rather than on the entire key. My input file is simple:
"aa bb".
"cc dd"
I'm not sure what I missed. I always get this error:
java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:371)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
This is a known bug, tracked in JIRA as MAPREDUCE-1888 (https://issues.apache.org/jira/browse/MAPREDUCE-1888). The bug was identified in Hadoop 0.21.0, but I don't think the fix is in any released version of Hadoop. If you really want to fix this yourself, you can:
- download the source code for Hadoop (for the release you are working with)
- download the patch from JIRA and apply it
- build and test Hadoop
Here are the instructions on how to apply a patch.
Alternatively, instead of using IdentityMapper and IdentityReducer, use a Python or Perl script that reads the k/v pairs from STDIN and writes the same k/v pairs to STDOUT without any processing. This amounts to writing your own IdentityMapper and IdentityReducer without using Java.
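A minimal sketch of such a pass-through script in Python, assuming the default streaming behavior where each record arrives as one line of text on STDIN. The function name identity and the file name identity.py used below are hypothetical, chosen only for illustration.

```python
#!/usr/bin/env python
"""Pass-through mapper/reducer for Hadoop Streaming (illustrative sketch)."""
import sys


def identity(lines, out):
    """Copy every input line to the output unchanged.

    `lines` is any iterable of text lines (such as sys.stdin) and `out`
    is any object with a write() method (such as sys.stdout).
    """
    for line in lines:
        out.write(line)


if __name__ == "__main__":
    # Streaming feeds records on STDIN and collects results from STDOUT.
    identity(sys.stdin, sys.stdout)
```

Saved as identity.py, the same script could be passed to both -mapper and -reducer in place of the two Java class names and shipped to the cluster with the streaming -file option. Because the script only ever emits text lines, the map output key is text rather than the LongWritable named in the error message.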