Where HDFS stores data

Question

I am trying to understand where Hadoop stores data in HDFS. I am referring to the config files core-site.xml and hdfs-site.xml.

The properties I have set are:


  • core-site.xml

<property>
    <name>hadoop.tmp.dir</name>
    <value>/hadoop/tmp</value>
</property>


  • hdfs-site.xml

    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/hadoop/hdfs/namenode</value>
    </property>
    
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/hadoop/hdfs/datanode</value>
    </property>
    


    With the above arrangement, the data blocks should be stored in the directory given by dfs.datanode.data.dir. Is this correct?

    I referred to the Apache Hadoop documentation and found this:


    • core-default.xml hadoop.tmp.dir -> A

    hdfs-default.xml dfs.datanode.data.dir -> Determines where on the local filesystem an DFS data node should store its blocks.

    The default value for this property is file://${hadoop.tmp.dir}/dfs/data

    Since I explicitly provided a value for dfs.datanode.data.dir in hdfs-site.xml, does that mean the data will be stored in that location? If so, would dfs/data be appended to ${dfs.datanode.data.dir}, so that the path becomes /hadoop/hdfs/datanode/dfs/data?

    However, I did not see this directory structure being created.

    One observation from my environment:

    After I run some MapReduce programs, the directory /hadoop/tmp/dfs/data gets created.

    So I am not sure whether the data is actually stored in the directory given by dfs.datanode.data.dir.

    Has anyone had a similar experience?

    Recommended answer

    The data for HDFS files will be stored in the directory specified in dfs.datanode.data.dir; the /dfs/data suffix that you see in the default value will not be appended.
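The override behaviour described here can be illustrated with a small sketch. It is not Hadoop's actual configuration code: the `resolve` helper and the dictionaries are hypothetical, and the only default it models is the one quoted in the question. It assumes that a value set in the site file replaces the default as a whole and that `${...}` references are expanded only where they appear.

```python
import re

# Default quoted in the question (from hdfs-default.xml).
DEFAULTS = {"dfs.datanode.data.dir": "file://${hadoop.tmp.dir}/dfs/data"}


def resolve(name, site, defaults=DEFAULTS):
    """Return the effective value of `name`.

    A value from the site config wins outright; otherwise the default is
    used. ${other.property} references are expanded recursively.
    """
    raw = site.get(name, defaults.get(name))
    if raw is None:
        raise KeyError(name)
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: resolve(m.group(1), site, defaults),
        raw,
    )


# Case 1: the property is set explicitly, as in the question.
explicit = {
    "hadoop.tmp.dir": "/hadoop/tmp",
    "dfs.datanode.data.dir": "file:/hadoop/hdfs/datanode",
}
# Case 2: the property is not picked up, so the default applies.
fallback = {"hadoop.tmp.dir": "/hadoop/tmp"}

print(resolve("dfs.datanode.data.dir", explicit))  # file:/hadoop/hdfs/datanode
print(resolve("dfs.datanode.data.dir", fallback))  # file:///hadoop/tmp/dfs/data
```

The second case yields the /hadoop/tmp/dfs/data path observed in the question, which is consistent with the explicit setting not being in effect on that node.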

    If you edit hdfs-site.xml, you'll have to restart the DataNode service for the change to take effect. Also remember that once the value is changed, the DataNode service can no longer serve blocks that were stored in the previous location.
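Before restarting, it can help to confirm which value the edited file actually declares. The following is a minimal sketch that assumes the usual layout of a `<configuration>` root with `<property>` children; `read_property` is a hypothetical helper, and the sample text stands in for the real hdfs-site.xml on the node.

```python
import xml.etree.ElementTree as ET

# Stand-in for the contents of hdfs-site.xml; values taken from the question.
SAMPLE_HDFS_SITE = """<configuration>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/hadoop/hdfs/namenode</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/hadoop/hdfs/datanode</value>
    </property>
</configuration>"""


def read_property(xml_text, name):
    """Return the <value> of the first <property> whose <name> matches, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            return prop.findtext("value", default="").strip()
    return None


print(read_property(SAMPLE_HDFS_SITE, "dfs.datanode.data.dir"))
```

To check a real file, read its text and pass it to `read_property` in place of the sample string. A `None` result means the property is absent from that file, in which case the default would apply.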

    Lastly, your values above are specified with file:/... instead of file://.... File URIs do need that extra slash, so that might be what is causing these values to revert to the defaults.
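The slash forms can be compared with a generic URI parser. This sketch uses Python's standard `urllib.parse`, not Hadoop's own parsing, so it only shows how each spelling splits into an authority and a path; whether a given Hadoop version rejects any of these forms is not demonstrated here. `split_file_uri` is a hypothetical helper.

```python
from urllib.parse import urlsplit


def split_file_uri(uri):
    """Return (authority, path) as seen by Python's generic URI parser."""
    parts = urlsplit(uri)
    return parts.netloc, parts.path


for uri in (
    "file:/hadoop/hdfs/datanode",    # single slash, as in the question
    "file:///hadoop/hdfs/datanode",  # file:// followed by an absolute path
    "file://hadoop/hdfs/datanode",   # file:// followed by a relative-looking path
):
    authority, path = split_file_uri(uri)
    print(f"{uri}: authority={authority!r} path={path!r}")
```

With this parser the first two spellings give the same path, while in the third "hadoop" is read as the authority (host) and the path becomes /hdfs/datanode. When adding slashes, the absolute path therefore needs to keep its own leading slash.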
