Hadoop HDFS - Cannot connect to port on master


Problem description


I've set up a small Hadoop cluster for testing. Setup went fairly well with the NameNode (1 machine), SecondaryNameNode (1) and all DataNodes (3). The machines are named "master", "secondary" and "data01", "data02" and "data03". All DNS are properly set up, and passwordless SSH was configured from master/secondary to all machines and back.

I formatted the cluster with bin/hadoop namenode -format, and then started all services using bin/start-all.sh. All processes on all nodes were checked to be up and running with jps. My basic configuration files look something like this:

<!-- conf/core-site.xml -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- 
      on the master it's localhost
      on the others it's the master's DNS
      (ping works from everywhere)
    -->
    <value>hdfs://localhost:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <!-- I picked /hdfs for the root FS -->
    <value>/hdfs/tmp</value>
  </property>
</configuration>

<!-- conf/hdfs-site.xml -->
<configuration>
  <property>
    <name>dfs.name.dir</name>
    <value>/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/hdfs/data</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>
</configuration>

# conf/masters
secondary

# conf/slaves
data01
data02
data03

I'm just trying to get HDFS running properly now.

I've created a directory for testing with hadoop fs -mkdir testing, then tried to copy some files into it with hadoop fs -copyFromLocal /tmp/*.txt testing. This is where Hadoop crashes, giving me more or less this:

WARN hdfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hd/testing/wordcount1.txt could only be replicated to 0 nodes, instead of 1
  at ... (such and such)

WARN hdfs.DFSClient: Error Recovery for block null bad datanode[0] nodes == null
  at ...

WARN hdfs.DFSClient: Could not get block locations. Source file "/user/hd/testing/wordcount1.txt" - Aborting...
  at ...

ERROR hdfs.DFSClient: Exception closing file /user/hd/testing/wordcount1.txt: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /user/hd/testing/wordcount1.txt could only be replicated to 0 nodes, instead of 1
  at ...

And so on. A similar issue occurs when I try to run hadoop fs -lsr . from a DataNode machine, only to get the following:

12/01/02 10:02:11 INFO ipc.Client: Retrying connect to server master/192.162.10.10:9000. Already tried 0 time(s).
12/01/02 10:02:12 INFO ipc.Client: Retrying connect to server master/192.162.10.10:9000. Already tried 1 time(s).
12/01/02 10:02:13 INFO ipc.Client: Retrying connect to server master/192.162.10.10:9000. Already tried 2 time(s).
...

I'm saying it's similar, because I suspect this is a port availability issue. Running telnet master 9000 reveals that the port is closed. I've read somewhere that this might be an IPv6 clash issue, and thus defined the following in conf/hadoop-env.sh:

export HADOOP_OPTS=-Djava.net.preferIPv4Stack=true

But that didn't do the trick. Running netstat on the master reveals something like this:

Proto Recv-Q Send-Q  Local Address       Foreign Address      State
tcp        0      0  localhost:9000      localhost:56387      ESTABLISHED
tcp        0      0  localhost:56386     localhost:9000       TIME_WAIT
tcp        0      0  localhost:56387     localhost:9000       ESTABLISHED
tcp        0      0  localhost:56384     localhost:9000       TIME_WAIT
tcp        0      0  localhost:56385     localhost:9000       TIME_WAIT
tcp        0      0  localhost:56383     localhost:9000       TIME_WAIT

At this point I'm pretty sure the problem is with the port (9000), but I'm not sure what I missed as far as configuration goes. Any ideas? Thanks.
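One way to pin that down (a sketch, assuming a Linux master with net-tools or lsof installed) is to check which address the NameNode's RPC port is actually bound to; 127.0.0.1 would mean only local clients can reach it, which matches the closed telnet port and the DataNodes' connection retries:

# On the master, after bin/start-all.sh: which address is port 9000 bound to?
netstat -tlnp | grep 9000
# or, alternatively:
sudo lsof -iTCP:9000 -sTCP:LISTEN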

Update

I found that hard-coding the DNS names into /etc/hosts not only helps resolve this, but also speeds up the connections. The downside is that you have to do this on all the machines in the cluster, and again when you add new nodes. Or you can just set up a DNS server, which I didn't.

Here's a sample from one node in my cluster (nodes are named hadoop01, hadoop02, etc., with the master and secondary being 01 and 02). Note that most of it is generated by the OS:

# this is a sample for a machine with dns hadoop01
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters

# --- Start list of nodes
192.168.10.101 hadoop01
192.168.10.102 hadoop02
192.168.10.103 hadoop03
192.168.10.104 hadoop04
192.168.10.105 hadoop05
192.168.10.106 hadoop06
192.168.10.107 hadoop07
192.168.10.108 hadoop08
192.168.10.109 hadoop09
192.168.10.110 hadoop10
# ... and so on

# --- End list of nodes

# Auto-generated hostname. Please do not remove this comment.
127.0.0.1 hadoop01 localhost localhost.localdomain
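With a file like the one above, it is worth double-checking (a hedged aside; getent is assumed to be available) what the node's own hostname resolves to, since the NameNode binds port 9000 to whatever address the host named in fs.default.name maps to:

# run on hadoop01 itself
hostname                 # should print hadoop01
getent hosts hadoop01    # should print 192.168.10.101, not 127.0.0.1

Because the 192.168.10.101 entry comes before the auto-generated 127.0.0.1 line, the first match wins and the name resolves to the external address.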

Hope this helps.

Solution

Replace localhost in hdfs://localhost:9000 with the NameNode's IP address or hostname in the fs.default.name property whenever remote nodes need to connect to the NameNode.
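A minimal sketch of that change for the original naming scheme (assuming "master" resolves to the NameNode's external address on every machine; the same file goes on the master itself, i.e. no localhost there either):

<!-- conf/core-site.xml, identical on the NameNode and all DataNodes -->
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- the NameNode's hostname or IP, never localhost -->
    <value>hdfs://master:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/hdfs/tmp</value>
  </property>
</configuration>

After changing it, restart the daemons (bin/stop-all.sh followed by bin/start-all.sh) so the NameNode rebinds port 9000 to the external address.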

"All processes on all nodes were checked to be up and running with jps"

There might still be errors in the log files; jps only confirms that the process is running.
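For that log check, something along these lines is usually enough (a sketch; the logs/ location and file-name pattern are assumptions based on a default Hadoop 1.x-style install with $HADOOP_HOME pointing at it):

# On the master: scan the NameNode log for problems
grep -iE "error|exception" $HADOOP_HOME/logs/hadoop-*-namenode-*.log | tail -n 20

# On a DataNode: same for the DataNode log
grep -iE "error|exception" $HADOOP_HOME/logs/hadoop-*-datanode-*.log | tail -n 20

# And ask the NameNode how many DataNodes it actually sees
hadoop dfsadmin -report

If dfsadmin -report shows 0 live DataNodes, that matches the "could only be replicated to 0 nodes" error from the question.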
