yarn hadoop 2.4.0: info message: ipc.Client Retrying connect to server


Problem description

I've searched for two days for a solution, but nothing worked.

First, I'm new to the whole Hadoop/YARN/HDFS topic and want to configure a small cluster.

The message above doesn't show up every time I run an example from mapreduce-examples.jar. Sometimes teragen works, sometimes not. In some cases the whole job fails; in others it finishes successfully. Sometimes the job fails without printing the message above at all.

14/06/08 15:42:46 INFO ipc.Client: Retrying connect to server: FQDN-HOSTNAME/XXX.XX.XX.XXX:53022. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)

This message is printed 30 times, and the port (53022 in the example) changes every time a job is started. If the job finishes successfully, this is printed:

14/06/08 15:34:20 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=SUCCEEDED. Redirecting to job history server
14/06/08 15:34:20 INFO mapreduce.Job: Job job_1402234146062_0002 running in uber mode : false
14/06/08 15:34:20 INFO mapreduce.Job:  map 100% reduce 100%
14/06/08 15:34:20 INFO mapreduce.Job: Job job_1402234146062_0002 completed successfully
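For reference, the retry count and sleep interval in the ipc.Client message above are governed by Hadoop's IPC client settings. As a sketch (these two property names exist in core-default.xml, though the exact policy shown in the log may be assembled by the MapReduce client rather than taken directly from them), they can be raised in core-site.xml:

```xml
<!-- core-site.xml: IPC client retry tuning (illustrative values, not from the post) -->
<property>
        <name>ipc.client.connect.max.retries</name>
        <value>10</value>
</property>
<property>
        <name>ipc.client.connect.retry.interval</name>
        <value>1000</value>
</property>
```

Raising retries only papers over the problem if the real cause is a misconfigured hostname, as the answer below suggests.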

If it fails, this is shown:

INFO mapreduce.Job: Job job_1402234146062_0005 failed with state FAILED due to: Task failed task_1402234146062_0005_m_000002
Job failed as tasks failed. failedMaps:1 failedReduces:0

In this case some tasks failed, but no reason or message can be found in the log files of the NodeManager, DataNode, ResourceManager, etc.

INFO mapreduce.Job: Task Id : attempt_1402234146062_0006_m_000002_1, Status : FAILED
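When a task attempt fails without an obvious cause, it can help to search every daemon and container log for the attempt id. A minimal sketch (the function name and example paths are my own, not from the post; point it at your actual HADOOP_LOG_DIR):

```shell
#!/bin/sh
# Sketch: list every file under a log directory that mentions a task attempt id.
find_attempt() {
    log_dir="$1"
    attempt_id="$2"
    grep -r -l "$attempt_id" "$log_dir" 2>/dev/null
}

# Example (assumed path; adjust to your installation):
# find_attempt /var/log/hadoop attempt_1402234146062_0006_m_000002_1
```

The container logs under the NodeManager's userlogs directory are often where the real stack trace hides, rather than in the daemon logs themselves.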

Additional information about my configuration:
OS: CentOS 6.5
Java version: OpenJDK Runtime Environment (rhel-2.4.7.1.el6_5-x86_64 u55-b13), OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

yarn-site.xml

<configuration>

<!-- Site specific YARN configuration properties -->
        <property>
                <name>yarn.nodemanager.address</name>
                <value>FQDN-HOSTNAME:8050</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle</value>
        </property>
        <property>
                  <name>yarn.nodemanager.localizer.address</name>
                  <value>FQDN-HOSTNAME:8040</value>
        </property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                  <name>yarn.resourcemanager.resource-tracker.address</name>
                  <value>FQDN-HOSTNAME:8025</value>
        </property>
        <property>
                  <name>yarn.resourcemanager.scheduler.address</name>
                  <value>FQDN-HOSTNAME:8030</value>
        </property>
        <property>
                  <name>yarn.resourcemanager.address</name>
                  <value>FQDN-HOSTNAME:8032</value>
        </property>
</configuration>

hdfs-site.xml

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.permissions</name>
                <value>false</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:///var/data/hadoop/hdfs/nn</value>
        </property>
        <property>
                <name>fs.checkpoint.dir</name>
                <value>file:///var/data/hadoop/hdfs/snn</value>
        </property>
        <property>
                <name>fs.checkpoint.edits.dir</name>
                <value>file:///var/data/hadoop/hdfs/snn</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:///var/data/hadoop/hdfs/dn</value>
        </property>
</configuration>

mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
        </property>
        <property>
                <name>mapreduce.cluster.temp.dir</name>
                <value>/mapred/tempDir</value>
        </property>
        <property>
                <name>mapreduce.cluster.local.dir</name>
                <value>/mapred/localDir</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>FQDN-HOSTNAME:10020</value>
        </property>
</configuration>

I hope somebody can help me. :)
Thank you,
Norman

Answer

The job sometimes finishes successfully because when you have one reducer and that reduce task happens to be sent to a working NodeManager, the job succeeds.

You have to make sure that FQDN-HOSTNAME is written exactly the same way in the slaves file. If I remember correctly, my solution was to remove the hostname-mapping entry in /etc/hosts, i.e. comment it out like this:

#127.0.0.1    FQDN-HOSTNAME
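The misconfiguration being commented out above can be spot-checked with a small script. A minimal sketch (function name and example paths are assumptions, not from the answer): it warns about any hostname in the slaves file that is mapped to a loopback address, which makes daemons bind to 127.x and become unreachable from other nodes.

```shell
#!/bin/sh
# Sketch: warn if any hostname listed in the slaves file is mapped to a
# loopback (127.x) address in an /etc/hosts-style file.
check_loopback_mapping() {
    hosts_file="$1"
    slaves_file="$2"
    while read -r host; do
        [ -z "$host" ] && continue
        # Look only at loopback lines, then match the hostname as a whole word.
        if grep "^127\." "$hosts_file" | grep -qw "$host"; then
            echo "WARNING: $host maps to a loopback address in $hosts_file"
        fi
    done < "$slaves_file"
}

# Example (assumed paths; adjust to your HADOOP_CONF_DIR):
# check_loopback_mapping /etc/hosts /etc/hadoop/slaves
```

After fixing /etc/hosts, restarting the YARN daemons is needed so they re-resolve and re-bind their addresses.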
