How to check the distributed data over HDFS
Question
As we know, Hadoop replicates data across several DataNodes in HDFS. Is there a command for checking how the data is distributed across the different nodes?
Answer
I think you might be looking for this command:
hdfs fsck /hdfs/path/to/data -files -blocks -locations
You'll get a report like the one below. It reports a list of all the blocks, their replication factor, and the set of hosts that the blocks are located on.
/hdfs/path/to/data/file.txt 4771082824 bytes, 36 block(s): OK
0. BP-22525430-10.14.103.78-1355873316066:blk_-3400885615428218530_203522 len=134217728 repl=3 [10.14.103.213:50010, 10.14.102.190:50010, 10.14.102.176:50010]
1. BP-22525430-10.14.103.78-1355873316066:blk_124203196739652236_203523 len=134217728 repl=3 [10.14.103.213:50010, 10.14.102.190:50010, 10.14.102.176:50010]
2. BP-22525430-10.14.103.78-1355873316066:blk_5886188080028552249_203524 len=134217728 repl=3 [10.14.103.213:50010, 10.14.102.190:50010, 10.14.102.176:50010]
3. BP-22525430-10.14.103.78-1355873316066:blk_-3222807870390148132_203525 len=134217728 repl=3 [10.14.103.213:50010, 10.14.102.190:50010, 10.14.102.176:50010]
4. BP-22525430-10.14.103.78-1355873316066:blk_-1285830390698132620_203526 len=134217728 repl=3 [10.14.103.213:50010, 10.14.102.190:50010, 10.14.102.176:50010]
5. BP-22525430-10.14.103.78-1355873316066:blk_-2680874809037637827_203527 len=134217728 repl=3 [10.14.103.213:50010, 10.14.102.190:50010, 10.14.102.176:50010]
6. BP-22525430-10.14.103.78-1355873316066:blk_8699277646297360652_203528 len=134217728 repl=3 [10.14.103.213:50010, 10.14.102.190:50010, 10.14.102.176:50010]
7. BP-22525430-10.14.103.78-1355873316066:blk_-2195916588803548138_203529 len=134217728 repl=3 [10.14.103.213:50010, 10.14.102.190:50010, 10.14.102.176:50010]
[more]
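If you want a per-node summary rather than the raw block list, the report can be post-processed. The following is a minimal sketch, not part of fsck itself: the function name `replicas_per_datanode` is hypothetical, and it assumes block lines look exactly like the sample above (an index, the block ID, `len=`, `repl=`, then a bracketed list of `ip:port` locations). Other Hadoop versions may print these lines differently, and DataNodes listed by hostname instead of IP would not be matched.

```python
import re
from collections import Counter

# A block line in the sample report above:
#   "<index>. <pool>:<block> len=<bytes> repl=<n> [<locations>]"
# This layout is an assumption taken from that sample only.
_BLOCK_LINE = re.compile(r"^\s*\d+\.\s+\S+\s+len=(\d+)\s+repl=(\d+)\s+\[(.*)\]")

# A DataNode address such as 10.14.103.213:50010 inside the brackets.
_LOCATION = re.compile(r"(\d{1,3}(?:\.\d{1,3}){3}):(\d+)")


def replicas_per_datanode(fsck_output):
    """Count how many block replicas each DataNode holds.

    fsck_output is the text printed by
    `hdfs fsck <path> -files -blocks -locations`.
    Returns a Counter keyed by "ip:port".
    """
    counts = Counter()
    for line in fsck_output.splitlines():
        match = _BLOCK_LINE.match(line)
        if match is None:
            continue  # file header, summary line, or blank line
        # Only look for addresses inside the bracketed location list,
        # so the IP embedded in the block pool ID is not counted.
        for host, port in _LOCATION.findall(match.group(3)):
            counts[f"{host}:{port}"] += 1
    return counts
```

To use it, capture the command's output as text (for example with `subprocess.run(..., capture_output=True, text=True).stdout`) and pass it to the function. A node whose count is much higher or lower than the others holds a disproportionate share of the file's replicas.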