Wordcount program is stuck in hadoop-2.3.0


Problem description



I installed hadoop-2.3.0 and tried to run the wordcount example, but it starts the job and then sits idle:

hadoop@ubuntu:~$ $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.3.0.jar    wordcount /myprg outputfile1
14/04/30 13:20:40 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
14/04/30 13:20:51 INFO input.FileInputFormat: Total input paths to process : 1
14/04/30 13:20:53 INFO mapreduce.JobSubmitter: number of splits:1
14/04/30 13:21:02 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1398885280814_0004
14/04/30 13:21:07 INFO impl.YarnClientImpl: Submitted application application_1398885280814_0004
14/04/30 13:21:09 INFO mapreduce.Job: The url to track the job: http://ubuntu:8088/proxy/application_1398885280814_0004/
14/04/30 13:21:09 INFO mapreduce.Job: Running job: job_1398885280814_0004

The url to track the job: application_1398885280814_0004/

With previous versions I didn't get such an issue; I was able to run the hadoop wordcount example in earlier versions. I followed these steps for installing hadoop-2.3.0.

Please suggest.

Solution

I had the exact same situation a while back when switching to YARN. Basically, MRv1 had the concept of task slots, while MRv2 has containers. The two differ greatly in how tasks are scheduled and run on the nodes.

The reason your job is stuck is that it is unable to find/start a container. If you go into the full logs of the ResourceManager, ApplicationMaster, and other daemons, you may find that nothing happens after it starts trying to allocate a new container.
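To inspect those logs from the command line, something like the following might help; the `yarn logs` fetch assumes log aggregation is enabled, the application ID is the one printed in the job output above (substitute your own), and the log file paths assume a default `$HADOOP_HOME/logs` layout:

```shell
# Fetch the aggregated container logs for the stuck application
# (requires yarn.log-aggregation-enable=true in yarn-site.xml).
$HADOOP_HOME/bin/yarn logs -applicationId application_1398885280814_0004

# Otherwise, tail the ResourceManager and NodeManager daemon logs on the
# node directly to see where container allocation stalls.
tail -n 100 $HADOOP_HOME/logs/yarn-*-resourcemanager-*.log
tail -n 100 $HADOOP_HOME/logs/yarn-*-nodemanager-*.log
```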

To solve the problem, you have to tweak the memory settings in yarn-site.xml and mapred-site.xml. While doing the same myself, I found a couple of tutorials especially helpful. I would suggest you start with very basic memory settings and optimize them later on. First check with a word-count example, then move on to more complex ones.
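As a starting point, a minimal set of memory properties might look like the sketch below. The values are illustrative assumptions for a small single node with roughly 2 GB available for YARN, not prescriptive settings; size them to your own hardware:

```xml
<!-- yarn-site.xml: total memory YARN may allocate on this node,
     and the bounds for a single container request -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>256</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value>
</property>

<!-- mapred-site.xml: per-task and ApplicationMaster container sizes;
     keep these within the scheduler bounds above, or the requests can
     never be granted and the job hangs exactly as described -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>512</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>512</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>1024</value>
</property>
```

If a container request exceeds `yarn.scheduler.maximum-allocation-mb` or the node's `yarn.nodemanager.resource.memory-mb`, the scheduler simply never grants it, which matches the "submitted but idle" behaviour in the question.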
