How to find the file for a block name in HDFS (Hadoop)

Problem description

What's the easiest way to find the file associated with a block in HDFS, given a block name/ID?

Solution

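The long and painful way, assuming you have read access to all the files (and execute access on the directories), is to list every file with its blocks and grep for the block ID, keeping the 20 preceding lines of context: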

hadoop fsck / -files -blocks | grep blk_520275863902385418_1002 -B 20

Then scan back up from your block match to the previous file name:

/hadoop/mapred/system/jobtracker.info 4 bytes, 1 block(s):  OK
0. blk_520275863902385418_1002 len=4 repl=1

In this case, blk_5202... is part of the /hadoop/mapred/system/jobtracker.info file.
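
The manual scan back up can be scripted. The following is a minimal sketch, not a definitive tool: the function name find_file_for_block is hypothetical, and it assumes the fsck output has the layout shown in the sample above (a line starting with "/" for each file, followed by its block lines). Other Hadoop versions may print extra fields, so check it against your own output first.

```shell
# Hypothetical helper: reads `hadoop fsck / -files -blocks` output on stdin
# and prints the path of the file that owns the given block.
# Assumes the output layout shown in the sample above.
find_file_for_block() {
  awk -v blk="$1" '
    # File and directory lines start with "/"; remember the most recent one,
    # dropping the trailing " N bytes, ..." summary so only the path remains.
    /^\// { path = $0; sub(/ [0-9]+ bytes,.*$/, "", path) }
    # The first line that mentions the block belongs to the remembered path.
    index($0, blk) { print path; exit }
  '
}

# Usage (needs a working Hadoop client, so it is left commented out here):
# hadoop fsck / -files -blocks | find_file_for_block blk_520275863902385418_1002
```

This avoids guessing how many lines of context to pass to grep -B, since a file with many blocks can push its name more than 20 lines above the match. The match is a plain substring test, so pass the full block name including the generation stamp to avoid hitting a different block that shares the same prefix.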

Programmatically, there isn't an interface to the name node that lets you search by block ID. You could, however, look into the source of the secondary name node to see how it consolidates the edits, then experiment on the saved output from the secondary name node (rather than risk working on the live name node files).

Good luck!
