What is the correct way to get a Hadoop FileSystem object that can be used for reading from/writing to HDFS?


Problem description

What is the correct way to create a FileSystem object that can be used for reading from/writing to HDFS? In some examples I've found, they do something like this:

final Configuration conf = new Configuration();
conf.addResource(new Path("/usr/local/hadoop/etc/hadoop/core-site.xml"));
conf.addResource(new Path("/usr/local/hadoop/etc/hadoop/hdfs-site.xml"));

final FileSystem fs = FileSystem.get(conf);

From looking at the documentation for the Configuration class, it looks like the properties from core-site.xml are loaded automatically when the object is created, as long as that file is on the classpath, so there is no need to add it again.
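For what it's worth, a minimal sketch along these lines (just an illustration; it assumes core-site.xml is on the classpath and defines fs.defaultFS) could be used to check that auto-loading:

import org.apache.hadoop.conf.Configuration;

public class ConfCheck {
    public static void main(String[] args) {
        // Configuration loads core-default.xml and core-site.xml from the
        // classpath by default, with no explicit addResource() call.
        Configuration conf = new Configuration();

        // If core-site.xml was picked up, this prints its fs.defaultFS value;
        // otherwise it falls back to the built-in default (file:///).
        System.out.println(conf.get("fs.defaultFS", "file:///"));
    }
}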



I haven't found anything that explains why adding hdfs-site.xml would be required, and it seems to work fine without it.



Would it be safe to just put core-site.xml on the classpath and skip hdfs-site.xml, or should I be adding both like I've seen in the examples? In what cases would the properties from hdfs-site.xml be required?

Solution

FileSystem needs only one configuration key to connect to HDFS successfully. Previously it was fs.default.name; from YARN onward it was changed to fs.defaultFS. So the following snippet is sufficient for the connection.

Configuration conf = new Configuration();
conf.set(key, "hdfs://host:port");  // where key = "fs.default.name" or "fs.defaultFS"

FileSystem fs = FileSystem.get(conf);

Tip: check core-site.xml to see which of the two keys is present, and set the same value for it in conf. If the machine you are running the code from has no hostname mapping for that host, use its IP address instead. On a MapR cluster the value will have a prefix like maprfs://.
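To tie this back to reading from and writing to HDFS, a minimal end-to-end sketch along these lines should work once fs.defaultFS points at your cluster; the NameNode address and file path below are placeholders, not values taken from the answer:

import java.nio.charset.StandardCharsets;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;

public class HdfsReadWrite {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Placeholder NameNode address; use the value from your core-site.xml.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");

        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/tmp/example.txt");

        // Write a small file to HDFS, overwriting it if it already exists.
        try (FSDataOutputStream out = fs.create(path, true)) {
            out.write("hello hdfs".getBytes(StandardCharsets.UTF_8));
        }

        // Read the file back and copy its contents to stdout.
        try (FSDataInputStream in = fs.open(path)) {
            IOUtils.copyBytes(in, System.out, 4096, false);
        }

        fs.close();
    }
}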

