hadoop-streaming example failed to run - Type mismatch in key from map


Problem description


I was running:

    $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
        -D stream.map.output.field.separator=. \
        -D stream.num.map.output.key.fields=4 \
        -input myInputDirs \
        -output myOutputDir \
        -mapper org.apache.hadoop.mapred.lib.IdentityMapper \
        -reducer org.apache.hadoop.mapred.lib.IdentityReducer

What should the input file look like when IdentityMapper is the mapper?

I was hoping to see it sort on certain selected key fields rather than on the entire key. My input file is simple, just "aa bb" and "cc dd". I'm not sure what I missed. I always get this error:

    java.lang.Exception: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:371)
    Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.hadoop.io.Text, recieved org.apache.hadoop.io.LongWritable

Solution

This is a known bug, and here is the JIRA: https://issues.apache.org/jira/browse/MAPREDUCE-1888. The bug was identified in Hadoop 0.21.0, but I don't think the fix is in any released version of Hadoop. If you really want to fix this yourself, you can:

  • download the source code for Hadoop (for the release you are working with)
  • download the patch from the JIRA and apply it
  • build and test Hadoop

Here are the instructions on how to apply a patch.

Alternatively, instead of using IdentityMapper and IdentityReducer, use a Python or Perl script that reads the k/v pairs from STDIN and writes the same k/v pairs to STDOUT without any processing. In effect, you create your own IdentityMapper and IdentityReducer without using Java.
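A minimal sketch of such a pass-through script in Python, assuming the records arrive one per line on STDIN as they normally do in streaming. The function name passthrough and the file name identity.py used below are hypothetical, chosen only for illustration.

```python
#!/usr/bin/env python3
"""Pass-through script for Hadoop Streaming: copy STDIN to STDOUT unchanged."""
import sys


def passthrough(lines, out):
    """Write every input line to `out` unchanged and return the line count."""
    count = 0
    for line in lines:
        # Each line already carries its trailing newline, so write it as-is.
        out.write(line)
        count += 1
    return count


if __name__ == "__main__":
    passthrough(sys.stdin, sys.stdout)
```

Saved as, say, identity.py and made executable, the same script could be given to both -mapper and -reducer in place of the Java class names, and shipped with the job through streaming's -file option. Because the script emits plain text lines, streaming should treat the map output key as Text, which is the type the job expects.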
