Hadoop safemode recovery - taking too long!


Problem description

I have a Hadoop cluster with 18 data nodes. I restarted the name node over two hours ago and the name node is still in safe mode.

I have been searching for why this might be taking so long and cannot find a good answer. The post "Hadoop safemode recovery - taking lot of time" is relevant, but I'm not sure whether I want or need to restart the name node after changing this setting, as that post mentions:

<property>
 <name>dfs.namenode.handler.count</name>
 <value>3</value>
 <final>true</final>
</property>

In any case, this is what I've been getting in 'hadoop-hadoop-namenode-hadoop-name-node.log':

2011-02-11 01:39:55,226 INFO org.apache.hadoop.ipc.Server: IPC Server handler 0 on 8020, call delete(/tmp/hadoop-hadoop/mapred/system, true) from 10.1.206.27:54864: error: org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop-hadoop/mapred/system. Name node is in safe mode.
The reported blocks 319128 needs additional 7183 blocks to reach the threshold 0.9990 of total blocks 326638. Safe mode will be turned off automatically.
org.apache.hadoop.hdfs.server.namenode.SafeModeException: Cannot delete /tmp/hadoop-hadoop/mapred/system. Name node is in safe mode.
The reported blocks 319128 needs additional 7183 blocks to reach the threshold 0.9990 of total blocks 326638. Safe mode will be turned off automatically.
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.deleteInternal(FSNamesystem.java:1711)
    at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.delete(FSNamesystem.java:1691)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.delete(NameNode.java:565)
    at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:616)
    at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:508)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:966)
    at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:962)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:416)
    at org.apache.hadoop.ipc.Server$Handler.run(Server.java:960)

Any advice is appreciated. Thanks!

Solution

I had this happen once, when some blocks were never reported in. I had to force the namenode to leave safe mode (hadoop dfsadmin -safemode leave) and then run an fsck to delete the missing files.
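
Before forcing the namenode out of safe mode, it can help to see how far the block reports actually are from the threshold. The following is a minimal Python sketch that parses the status line quoted in the question and redoes the arithmetic. The function name `safemode_gap` and the returned keys are made up for illustration, and truncating `total * threshold` to an integer is an assumption that happens to reproduce the number in the log; it is not taken from the Hadoop source.

```python
import re

# Matches the safe mode status line as it appears in the namenode log above.
SAFEMODE_RE = re.compile(
    r"The reported blocks (\d+) needs additional (\d+) blocks "
    r"to reach the threshold ([\d.]+) of total blocks (\d+)"
)


def safemode_gap(message):
    """Hypothetical helper: extract block counts from a safe mode message."""
    m = SAFEMODE_RE.search(message)
    if m is None:
        raise ValueError("not a safe mode status message")
    reported = int(m.group(1))
    logged_additional = int(m.group(2))
    threshold = float(m.group(3))
    total = int(m.group(4))

    # Assumption: the block threshold is total * threshold, truncated.
    needed = int(total * threshold)
    return {
        "reported": reported,
        "total": total,
        "needed": needed,                    # blocks required to leave safe mode
        "shortfall": needed - reported,      # blocks still missing for the threshold
        "logged_additional": logged_additional,
        "unreported": total - reported,      # blocks no datanode has reported yet
        "reported_fraction": reported / total,
    }


if __name__ == "__main__":
    line = (
        "The reported blocks 319128 needs additional 7183 blocks to reach "
        "the threshold 0.9990 of total blocks 326638. "
        "Safe mode will be turned off automatically."
    )
    info = safemode_gap(line)
    print("shortfall:", info["shortfall"])
    print("unreported blocks:", info["unreported"])
    print("reported fraction: %.4f" % info["reported_fraction"])
```

For the log line in the question this gives a shortfall of 7183 blocks, matching the logged value, and 7510 blocks that have not been reported at all. If those numbers stop changing between log entries, the remaining blocks are probably never going to be reported, which is the situation where forcing the namenode out of safe mode and running fsck applies.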
