Hadoop: ...be replicated to 0 nodes instead of minReplication (=1). There are 1 datanode(s) running and no node(s) are excluded in this operation


Problem Description



I'm getting the following error when attempting to write to HDFS as part of my multi-threaded application:

could only be replicated to 0 nodes instead of minReplication (=1).  There are 1 datanode(s) running and no node(s) are excluded in this operation.

I've already tried the top-rated answer about reformatting from HDFS error: could only be replicated to 0 nodes, instead of 1, but it doesn't work for me.

What is happening is this:

  1. My application consists of 2 threads, each configured with its own Spring Data PartitionTextFileWriter
  2. Thread 1 is the first to process data and this can successfully write to HDFS
  3. However, once Thread 2 starts to process data I get this error when it attempts to flush to a file

Threads 1 and 2 will not be writing to the same file, although they do share a parent directory at the root of my directory tree.
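
For reference, the question does not include the writer code itself, so the following is only a rough stand-in: a minimal sketch of two threads writing to separate files under a shared parent directory, using the plain HDFS Java API rather than the Spring Data PartitionTextFileWriter. The NameNode URI and file paths are hypothetical, and the hflush() call marks roughly the point (the flush in step 3) where the error was reported.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class TwoThreadHdfsWrite {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical NameNode address; normally resolved from core-site.xml (fs.defaultFS).
            URI hdfs = URI.create("hdfs://namenode-host:9000");

            Thread t1 = new Thread(() -> writeFile(hdfs, conf, "/metrics/abc/part-thread-1"));
            Thread t2 = new Thread(() -> writeFile(hdfs, conf, "/metrics/abc/part-thread-2"));
            t1.start();
            t2.start();
            t1.join();
            t2.join();
        }

        private static void writeFile(URI uri, Configuration conf, String path) {
            // Each thread gets its own FileSystem handle and writes to its own file.
            try (FileSystem fs = FileSystem.newInstance(uri, conf);
                 FSDataOutputStream out = fs.create(new Path(path), true)) {
                out.writeBytes("sample record\n");
                out.hflush();  // flush into the DataNode pipeline; the reported error surfaced on a flush
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }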

There are no problems with disk space on my server.

I also see this in my NameNode logs, but I'm not sure what it means:

2016-03-15 11:23:12,149 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy
2016-03-15 11:23:12,150 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 1 but only 0 storage types can be selected (replication=1, selected=[], unavailable=[DISK], removed=[DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2016-03-15 11:23:12,150 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable:  unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2016-03-15 11:23:12,151 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 9000, call org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.104.247.78:52004 Call#61 Retry#0
java.io.IOException: File /metrics/abc/myfile could only be replicated to 0 nodes instead of [2016-03-15 13:34:16,663] INFO [Group Metadata Manager on Broker 0]: Removed 0 expired offsets in 1 milliseconds. (kafka.coordinator.GroupMetadataManager)

What could be the cause of this error?

Thanks

Solution

This error is raised by HDFS's block replication system when it cannot place even a single replica of a block of the file being written. Common reasons for that:

  1. Only the NameNode instance is running, and it is not in safe mode
  2. No DataNode instances are up and running, or some of them are dead (check the servers; a programmatic check of live DataNodes and free space is sketched after this list)
  3. The NameNode and DataNode instances are both running, but they cannot communicate with each other, i.e. there is a connectivity issue between the DataNode and NameNode instances
  4. Running DataNode instances cannot talk to the server because of networking or Hadoop-level issues (check the logs that contain DataNode information)
  5. There is no disk space left in the data directories configured for the DataNode instances, or the DataNode instances have run out of space (check dfs.data.dir and delete old files, if any)
  6. The reserved space configured for the DataNode instances in dfs.datanode.du.reserved is larger than the free space, which makes the DataNode instances conclude that there is not enough free space
  7. There are not enough handler threads for the DataNode instances (check the DataNode logs and the dfs.datanode.handler.count value)
  8. Make sure that dfs.data.transfer.protection is not set to "authentication" and that dfs.encrypt.data.transfer is set to true.
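
As a rough illustration of items 2, 5 and 6 above (this is not part of the original answer), the sketch below asks the NameNode how many DataNodes it currently considers live and how much space it believes is usable. The NameNode URI is hypothetical, and on a secured cluster getDataNodeStats() may require HDFS superuser privileges.

    import java.net.URI;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.FsStatus;
    import org.apache.hadoop.hdfs.DistributedFileSystem;
    import org.apache.hadoop.hdfs.protocol.DatanodeInfo;

    public class ClusterCapacityCheck {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Hypothetical NameNode URI; use the cluster's real fs.defaultFS value.
            try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode-host:9000"), conf)) {
                DistributedFileSystem dfs = (DistributedFileSystem) fs;

                // Item 2: how many DataNodes does the NameNode consider live right now?
                DatanodeInfo[] live = dfs.getDataNodeStats();
                System.out.println("Live DataNodes reported by the NameNode: " + live.length);

                // Items 5 and 6: total capacity vs. space still usable for new blocks
                // (dfs.datanode.du.reserved reduces what each DataNode reports as remaining).
                FsStatus status = dfs.getStatus();
                System.out.printf("capacity=%d used=%d remaining=%d%n",
                        status.getCapacity(), status.getUsed(), status.getRemaining());

                for (DatanodeInfo node : live) {
                    System.out.printf("%s remaining=%d bytes%n",
                            node.getHostName(), node.getRemaining());
                }
            }
        }
    }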

Also please:

  • Verify the status of NameNode and DataNode services and check the related logs
  • Verify that core-site.xml has the correct fs.defaultFS value and that hdfs-site.xml has valid values (the sketch after this list prints the values the client actually resolves).
  • Verify that hdfs-site.xml has dfs.namenode.http-address.. specified for all NameNode instances in the case of a PHD HA configuration.
  • Verify if the permissions on the directories are correct
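
The configuration checks in the list above can be made concrete by printing the values the HDFS client actually resolves from the core-site.xml and hdfs-site.xml on its classpath. This is only an illustrative sketch; the keys listed are the standard Hadoop property names mentioned in this answer.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hdfs.HdfsConfiguration;

    public class PrintEffectiveHdfsConfig {
        public static void main(String[] args) {
            // HdfsConfiguration loads core-site.xml and hdfs-site.xml from the classpath,
            // the same way the writing client does, so this shows the effective values.
            Configuration conf = new HdfsConfiguration();
            String[] keys = {
                    "fs.defaultFS",
                    "dfs.replication",
                    "dfs.datanode.du.reserved",
                    "dfs.datanode.handler.count",
                    "dfs.data.transfer.protection",
                    "dfs.encrypt.data.transfer",
                    "dfs.namenode.http-address"
            };
            for (String key : keys) {
                System.out.println(key + " = " + conf.get(key));  // null means the key is not set
            }
        }
    }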

Ref: https://wiki.apache.org/hadoop/CouldOnlyBeReplicatedTo

Ref: https://support.pivotal.io/hc/en-us/articles/201846688-HDFS-reports-Configured-Capacity-0-0-B-for-datanode

Also, please check: Writing to HDFS from Java, getting "could only be replicated to 0 nodes instead of minReplication"
