Updating a Hadoop HDFS file
Problem description
I am a newbie to Hadoop. I have been reading that HDFS is mostly about "write once, read many times". I have a use case where I may have to modify files stored in HDFS, and I have been researching whether there is any way to do this.
My question is: would it be possible to load the HDFS file into HBase, make the modifications, save it back to HDFS, and delete the original file? Please let me know if this is feasible.
Solution
If you need to update values in a file, you are much better off using HBase. You can still use your HBase table in your MR jobs via TableInputFormat and TableOutputFormat. If you only want to append data, you can use any Hadoop version that supports HDFS append, such as 0.20.205.0.
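If you only need an occasional whole-file change, the rewrite-and-replace idea from the question can also be done without HBase: stream the old file through an edit function into a new file, then swap the new file in for the original. Below is a minimal sketch of the streaming step. It uses only plain java.io so it runs without a cluster; the class name `HdfsRewriteSketch`, the `rewrite` helper, and the sample `key=value` data are all made up for illustration. The comments note where the Hadoop `FileSystem` calls would fit, assuming Hadoop is on the classpath.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.function.UnaryOperator;

// Hypothetical helper: copies a text stream line by line, applying an edit to each line.
class HdfsRewriteSketch {

    // On a real cluster, `in` could be the stream returned by FileSystem.open(oldPath)
    // and `out` the stream returned by FileSystem.create(tmpPath). Both streams are
    // closed when the copy finishes.
    static void rewrite(InputStream in, OutputStream out, UnaryOperator<String> edit) {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
             BufferedWriter writer =
                 new BufferedWriter(new OutputStreamWriter(out, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(edit.apply(line));
                writer.write('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // After a successful copy on HDFS, the swap would be roughly:
        //   fs.delete(oldPath, false);
        //   fs.rename(tmpPath, oldPath);
    }

    public static void main(String[] args) {
        // In-memory stand-ins for the old and new HDFS files.
        byte[] oldFile = "a=1\nb=2\n".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream newFile = new ByteArrayOutputStream();

        // Example edit: change the value stored under key "b".
        rewrite(new ByteArrayInputStream(oldFile), newFile,
                line -> line.startsWith("b=") ? "b=3" : line);

        System.out.print(new String(newFile.toByteArray(), StandardCharsets.UTF_8));
    }
}
```

Note that this rewrites the entire file for every change, so it only makes sense for infrequent updates. For frequent updates to individual values, HBase remains the better fit.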