keep failing in running hadoop distributed mode


I've been stuck on this problem for a very long time. I'm trying to run a job in distributed mode. I have 2 datanodes and a master running the namenode and jobtracker. I keep getting the following error in the tasktracker.log of each node:

2012-01-03 08:48:30,910 WARN  mortbay.log - /mapOutput: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201201031846_0001/attempt_201201031846_0001_m_000000_1/output/file.out.index in any of the configured local directories
2012-01-03 08:48:40,927 WARN  mapred.TaskTracker - getMapOutput(attempt_201201031846_0001_m_000000_2,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201201031846_0001/attempt_201201031846_0001_m_000000_2/output/file.out.index in any of the configured local directories
    at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:389)
    at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
    at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2887)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
    at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
    at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)

and this error in the hadoop.log of the slave:

2012-01-03 10:20:36,732 WARN  mapred.ReduceTask - attempt_201201031954_0006_r_000001_0 adding host localhost to penalty box, next contact in 4 seconds
2012-01-03 10:20:41,738 WARN  mapred.ReduceTask - attempt_201201031954_0006_r_000001_0 copy failed: attempt_201201031954_0006_m_000001_2 from localhost
2012-01-03 10:20:41,738 WARN  mapred.ReduceTask - java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000001_2&reduce=1
    at sun.reflect.GeneratedConstructorAccessor6.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1447)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1349)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
Caused by: java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000001_2&reduce=1
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)
    ... 4 more

2012-01-03 10:20:41,739 WARN  mapred.ReduceTask - attempt_201201031954_0006_r_000001_0 adding host localhost to penalty box, next contact in 4 seconds
2012-01-03 10:20:46,761 WARN  mapred.ReduceTask - attempt_201201031954_0006_r_000001_0 copy failed: attempt_201201031954_0006_m_000000_3 from localhost
2012-01-03 10:20:46,762 WARN  mapred.ReduceTask - java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000000_3&reduce=1
    at sun.reflect.GeneratedConstructorAccessor6.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
    at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1447)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1349)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
    at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
Caused by: java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000000_3&reduce=1
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)
    ... 4 more

This is my configuration:

mapred-site:

<property>
  <name>mapred.job.tracker</name>
  <value>10.20.1.112:9001</value>
  <description>The host and port that the MapReduce job tracker runs
  at.</description>
</property>

<property> 
  <name>mapred.map.tasks</name>
  <value>2</value>
  <description>
    define mapred.map tasks to be number of slave hosts
  </description> 
</property> 

<property> 
  <name>mapred.reduce.tasks</name>
  <value>2</value>
  <description>
    define mapred.reduce tasks to be number of slave hosts
  </description> 
</property> 

<property>
  <name>mapred.system.dir</name>
  <value>filesystem/mapreduce/system</value>
</property>

<property>
  <name>mapred.local.dir</name>
  <value>filesystem/mapreduce/local</value>
</property>

<property>
  <name>mapred.submit.replication</name>
  <value>2</value>
</property>
<property>
    <name>hadoop.tmp.dir</name>
    <value>tmp</value>
</property>

<property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
</property>

core-site:

<property>
  <name>fs.default.name</name>
  <value>hdfs://10.20.1.112:9000</value>
  <description>The name of the default file system. A URI whose
  scheme and authority determine the FileSystem implementation.
  </description>
</property>

I've tried playing with the tmp dir, which didn't help. I've tried playing with mapred.local.dir, which also didn't help.

I also tried to see what is in the filesystem dir during runtime. I found that the path taskTracker/jobcache/job_201201031846_0001/attempt_201201031846_0001_m_000000_1/ exists, but it doesn't have an output folder in it.
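The runtime check described above can be scripted. A sketch using a mock directory tree to illustrate the symptom (replace the mock root with your actual mapred.local.dir; the job and attempt IDs are taken from the logs in the question):

```shell
# Build a mock jobcache layout like the one observed in the question:
# the attempt directory exists, but it has no output/ subfolder.
root=mock-local-dir/taskTracker/jobcache/job_201201031846_0001
mkdir -p "$root/attempt_201201031846_0001_m_000000_1"

# Flag attempt dirs that are missing the map output index the servlet looks for
for d in "$root"/attempt_*; do
  [ -f "$d/output/file.out.index" ] || echo "missing map output: $d"
done
```

On a real tasktracker, an attempt directory without output/file.out.index is exactly what produces the DiskErrorException in the first log.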

Any ideas?

Thanks.

Solution

I think the problem is this: your tasktracker needs to request the map output from the master, so the URL should be:

http://10.20.1.112:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000001_2&reduce=1

but on your task node it tries to fetch it from

http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000001_2&reduce=1  

That is why the problem occurs. The main problem is not hadoop.tmp.dir, mapred.system.dir or mapred.local.dir. I faced this problem too, and I resolved it by deleting the "127.0.0.1 localhost" line in /etc/hosts on the master. Maybe you can try it!
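One quick way to see the kind of mismatch described above is to check name resolution on each node (a sketch, assuming a Linux box with getent available):

```shell
# Check how "localhost" resolves and what name this node calls itself.
# If the master's /etc/hosts maps 127.0.0.1 to localhost, the jobtracker may
# hand reducers "localhost" as the map-output host, and every slave then tries
# to fetch map output from its own loopback interface, which fails as in the
# FileNotFoundException logs above.
getent hosts localhost
hostname
```

Run this on the master and on each slave; the hostname each node reports should resolve to its real cluster IP, not 127.0.0.1.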

EDIT

In summary, go to the /etc/hosts file on the node that's causing the error and remove the line 127.0.0.1 localhost.
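The edit can be sketched as follows, applied to a copy of the file so it is safe to experiment with (on the real node you would edit /etc/hosts itself as root, and the slave entries below are illustrative; only 10.20.1.112 comes from the question's config):

```shell
# Work on a sample copy of an /etc/hosts like the one described above
cat > hosts.fixed <<'EOF'
127.0.0.1   localhost
10.20.1.112 master
10.20.1.113 slave1
EOF

# Drop the line mapping 127.0.0.1 to localhost
sed -i '/^127\.0\.0\.1[[:space:]].*localhost/d' hosts.fixed
cat hosts.fixed
```

After removing the line, restart the Hadoop daemons so the tasktrackers re-register under their resolvable hostnames.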
